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1 . 0  INTRODUCTION 


In  multichannel  identification  problems  the  outputs  of 
multiple  channels  (or  sensors)  are  available,  and  it  is  desired  to 
identify  the  parameters  of  an  analytical  model  to  represent  the 
phenomena  being  observed  via  the  channel  outputs.  Similarly,  in 
multichannel  detection  problems  the  outputs  of  multiple  channels 
are  available,  and  it  is  desired  to  determine  the  presence  (or 
absence)  of  a  desired  signal  component  in  the  channel  data.  In 
the  combined  problem  of  multichannel  identification  and  detection 
a  model  is  estimated  for  the  phenomena  being  observed  via  the 
channel  outputs,  and  the  identified  model  is  used  to  facilitate 
the  detection  of  a  desired  signal  in  the  channel  output  data. 
Multichannel  identification  and  detection  is  thus  referred  to  also 
as  model-based  multichannel  detection.  In  all  of  these  problems 
the  channel  data  is  available  simultaneously  over  many  channels  of 
the  same  type,  or  over  many  distinct  channels  (each  channel 
corresponding  to  a  different  sensor  type) . 

This  report  is  a  summary  of  the  work  carried  out  in  this 
program.  Specifically,  the  development  of  state  space  algorithms 
for  model-baseu  multichannel  deseucion  in  the  context  of 
surveillance  radar  system  applications  is  addressed.  In 
surveillance  radar  systems  (radar  arrays)  the  channels  correspond 
to  separate  antenna  apertures  (or  elements  of  a  single  aperture 
array)  .  The  desired  signal  may  or  may  not  be  pu.esent  in  the 
channel  output  data  at  any  given  time.  The  data  in  each  channel 
generally  includes  noise  (broadband  interference)  as  well  as 
"clutter"  (narrowband  interference),  with  low  s ignal-to-clutter 
ratio  and,  possibly,  low  signal-to-noise  ratio  also.  Model-based 
detection  methods  must  discriminate  between  the  condition  of 
target  embedded  in  clutter  and  noise,  and  the  condition  of  clutter 
and  noise  only. 
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Figure  1-1  illustrates  a  radar  array  system  consisting  of 
multiple  subarrays  or  array  elements.  The  output  of  each  subarray 
(or  each  individual  array  element)  is  a  complex-valued,  scalar, 
digital  sequence,  denoted  as  {Xj(n)} .  The  collection  of  the  J  scalar 

sequences  is  arranged  into  a  J-dimens ional  vector,  U(n)}  ,  which  is 
input  to  a  multichannel  processor  (not  shown  in  the  figure) . 


Channel  No.  1 


Channel  No.  J 


Figure  1-1.  -  Radar  array  with  J  subarrays  or  individual  elements. 


In  this  study  the  multivariate  (multiple  input,  multiple 
output)  state  space  model  class  was  adopted  to  represent  the 
multichannel  radar  data,  and  advanced  system  identification 
techniques  were  applied  to  estimate  the  model  parameters.  The 
modeling  of  the  complex-valued  pre-processed  radar  signals  for 
multichannel  detection  using  the  state  space  model  class  is  one  of 
the  contributions  of  this  work.  State  space  models  have  been  used 
in  the  context  of  target  tracking  (where  the  detected  radar  signal 
is  processed  further  to  estimate  a  trajectory)  and  for  the 
determination  of  weights  in  antenna  array  sidelobe  canceling  and 
related  problems,  but  not  for  multichannel  detection.  Model-based 
detection  has  been  carried  out  using  the  more-restricted  time 
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series  models  v  ^chels,  1991;  Metford  and  Haykin,  1985),  which  are 
included  wicnin  the  class  of  state  space  models  and  can  be 
represented  as  such. 

The  methodology  formulated  in  this  study  is  based  on  a  state 
space  identification  algorithm  developed  by  Desai  et  al.  (1985), 
which  in  turn  is  based  on  the  stochastic  realization  concepts 
formulated  by  Akaike  {1974  ;  1975)  and  Faurre  (1976)  .  This 
identification  algorithm  has  several  unique  features.  Foremost 
among  these,  the  algorithm  identifies  the  model  parameters  in  the 
innovations  representation.  As  a  result,  a  steady-state  Kalman 
filter  design  is  obtained  as  an  inherent  by-product  of  the 
algorithm,  without  having  to  solve  a  nonlinear  Riccati  equation. 
Implementation  of  the  algorithm  is  carried  out  using  the  singular 
value  decomposition  (SVD) ,  which  is  a  stable  numerical  technique. 

An  important  distinction  in  the  context  of  radar  system 
applications  is  that  the  vector  random  processes  which  represent 
the  channel  data  are  complex-valued  processes  in  most  cases.  Most 
time  series  techniques  and  models  have  been  formulated  for  com.plex 
as  well  as  real  processes.  The  same,  however,  cannot  be  said 
about  state-space  techniques;  state-space  methods  and  results 
available  in  the  literature  have  been  defined  almost  exclusively 
for  the  case  of  real-valued  processes,  including  the  stochastic 
realization  algorithms  adopted  herein.  In  t.hls  study  the 
stochastic  realization  formulations  and  algorithms  were  extended 
to  the  case  of  complex-valued  processes,  which  is  the  formulation 
presented  in  this  report. 

A  computer  simulation  was  generated  as  part  of  this  program 
to  validate  the  methodology  and  the  algorithms,  and  to  carry  out 
limited  simulation-based  analyses.  This  software  was  exercised 
with  simulated  multichannel  data  generated  at  RL,  and  the  modeling 
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and  identification  results  compare  favorably  with  the  results 
obtained  at  RL  using  auto-regressive  models. 

In  summary,  the  analytical  and  simulation  results  obtained  in 
this  program  indicate  that  the  SSC  algorithm  and  methodology  for 
model-based  multichannel  detection  has  the  potential  to  result  in 
significant  advances  for  radar  system  applications. 

1  .  1  Notation 

Vector  variables  are  denoted  by  underscored  lower-case 
letters  (including  Greek  letters) .  Matrices  are  denoted  by  upper¬ 
case  letters  (including  Greek  letters) .  Some  scalars  (such  as  the 
order  of  the  state  variable  model)  are  denoted  also  by  upper-case 
letters.  Vector  spaces  are  denoted  by  upper-case  script  letters, 
such  as  1/.  The  expectation  operator  is  denoted  as  EH;  superscript 

T  and  H  are  used  to  denote  the  matrix  and  vector  transpose  and  the 
Hermitian  transpose  operators,  respectively;  and  an  asterisk  (*) 
denotes  the  complex  conjugate  operator.  1)^  denotes  an  M- 
dimensional  'identity  matrix,  denotes  an  NxJ  null  (zero) 

matrix,  denotes  an  M-dimens ional  (square)  null  matrix,  and 

denotes  an  M-dimensional  zero  vector.  iA|  denotes  the  determinant 
of  matrix  A;  A'^  denotes  the  inverse  of  matrix  A;  A'*’  denotes  the 
pseudoinverse  of  A;  rank  (A)  denotes  the  rank  of  A;  A(i,j)  and  a|j  are 
both  used  to  denote  the  (i,j)th  element  of  matrix  A;  and 
denotes  the  orthogonal  projection  of  onto  ^  caret  {'')  over 

a  variable  denotes  an  estimate  of  the  variable,  a  bar  (-)  over  a 
variable  is  used  to  represent  the  mean  of  the  variable,  and  In (3) 
denotes  the  natural  logarithm  of  a.  The  symbol  X  denotes  "is 
orthogonal  to;"  0  denotes  the  direct  sum  of  vector  spaces;  V 
denotes  "for  all;"  and  e  denotes  "is  an  element  of." 
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Where  possible,  the  symbols  used  herein  to  represent 
variables  match  the  symbols  used  by  Michels  (1991)  to  facilitate 
enhancing  the  software  available  at  'lome  Laboratory  (RL)  with  the 
techniques  developed  in  this  program.  This  philosophy  forces  the 
use  of  non-standard  symbols  to  represent  the  parameters  of  a  state 
variable  model.  Of  course,  notational  convention  should  not  be  a 
major  issue  provided  ail  symbols  are  defined  appropriately. 
However,  it  is  important  to  mention  this  poinr  in  order  to  avoid 
possible  confusion  on  the  part  of  the  reader. 

1  •  2  Report _ QyexYifiTg 

An  introduction  to  the  model-based  multichannel  detection 
problem  is  presented  in  Section  2.0.  This  section  includes  also 
the  definition  of  the  state  space  model  class  and  several  related 
concepts,  including  the  backward  model  associated  with  a  forward 
model,  and  the  innovations  representation  for  a  random  process. 
The  parameter  identification  algorithm  is  presented  in  Section 
3.0,  using  an  approach  which  differs  from  the  approach  of  Desai  et 
al.  (1985)  .  ■  This  alternative  approach  given  here  is  simple,  and 
enhances  intuition.  As  mentioned  earlier,  this  algorithm  is  the 
backbone  of  the  Scientific  Studies  Corporation  (SSC)  multichannel 
detection  methodology  presented  herein.  Kalman  filtering  of  the 
channel  data  to  generate  the  innovations  sequence  is  discussed  in 
Section  4.0.  The  innovations  sequence  is  fed  to  a  likelihood 
ratio  detector  which  generates  the  detection  decision,  as 
described  in  Section  5.0.  A  discussion  of  the  software  generated 
in  the  program  is  presented  in  Section  6.0,  along  with  several 
simulation  results  which  demonstrate  the  signal  discrimination 
capability  of  the  algorithm.  Section  7.0  includes  the  main 
conclusions  and  recommendations  borne  out  of  this  study. 
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The  three  appendices  provide  background  material  in  a  form 
which  is  not  readily  available  elsewhere.  Appendix  A  presents  a 
methodology  for  generating  the  state  space  representation  of  three 
conventional  time  series  models  (moving-average,  auto-regressive, 
and  auto-regressive  moving-average) .  Appendix  B  presents  a 
summary  of  relevant  aspects  of  deterministic  realization  theory 
and  algorithms.  The  extension  of  canonical  correlations  to 
complex-valued  variables  is  presented  in  Appendix  C.  This  is  an 
important  result  for  Section  3.0. 
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2  .  0  MODEL-BASED  MULTICHANNEL  DETECTION 


The  model-based  approach  to  multichannel  detection  involves 
processing  the  channel  data  to  identify  the  parameters  of  a  model 
for  the  multichannel  system,  and  determination  of  a  detection 
decision  utilizing  the  identified  parameters  to  filter  the  channel 
data.  Model  parameters  can  be  identified  on-line,  as  the  channel 
data  is  received  and  processed.  Alternatively,  the  model 
parameters  can  be  identified  off-line  for  various  conditions  and 
stored  in  the  processor  memory  to  be  accessed  in  real-time  as 
required . 

There  are  two  general  classes  of  linear  parametric  models  for 
vector  random  processes:  time  series  models  and  state  space 
models.  Time  series  models  include  moving-average  (.MA)  models, 
auto-regressive  (AR)  models,  and  auto-regressive  moving-average 
(ARMA)  models.  State  space  models  are  more  general  than  time 
series  models;  in  fact,  MA,  AR,  and  ARMA  models  can  be  represented 
by  state  space  models  (Appendix  A)  .  In  the  state  space 
literature,  .the  determination  of  the  model  parameters  based  on 
output  data  (and,  sometimes,  input  data  also)  is  referred  to  as  a 
stochastic  identification  or  a  stochastic  realization  problem. 

Time  series  models  have  been  applied  to  the  multichannel 
detection  problem,  and  the  performance  results  obtained  provide 
encouragement  for  further  researc.h  (see,  for  e.xample,  Michels, 
1991,  and  the  references  therein)  .  The  results  obtained  by 
Michels  (1991)  assume  that  the  multichannel  output  process  can  be 
modeled  as  a  vector  AR  process.  Given  the  generality  of  state- 
space  models  and  the  wealth  of  results  available  in  the  state- 
space  literature,  the  state  space  model  class  was  selected  in  this 
program  to  represent  the  multichannel  signals  in  the  model-based 
multichannel  detection  problem  for  radar  systems. 


In  the  case  of  time  series  models,  two  types  of  model 
parameter  estimation  algorithms  have  been  established  in  the 
literature:  (a)  algorithms  which  operate  on  channel  output 
correlation  matrices,  such  as  the  extended  Levinson  algorithm 
(Anderson  and  Moore,  19'’9)  ,  and  (b)  algorithms  which  operate  on 
the  channel  output  data  directly  (without  the  need  to  compute 
channel  output  correlation  matrices) ,  such  as  the  Levinson- 
Wiggins-Robinson  algorithm  (Wiggins  and  Robinson,  1965)  and  the 
Strand-Nuttall  algorithm  (Strand,  1977;  Nuttall,  1976). 

The  state-space  parameter  identification  algorithm  adopted 
for  this  study  operates  on  channel  output  correlation  matrices. 
The  algorithm  formulation  is  due  to  Desai  et  al .  (1985),  and  is 
based  on  the  stochastic  realization  concepts  developed  by  Akai)ce 
(1974,  1975)  and  Faurre  (1976).  Implementation  of  the  algorithm 
is  carried  out  via  the  singular  value  decomposition.  This 
algorithm  has  several  attractive  features,  including  direct 
estimation  of  the  parameters  for  a  Kalman  filter,  without  the 
requirement  to  solve  a  nonlinear  Riccati  equation. 

2  •  1  Multichannel _ Detection 

Detection  problems  in  the  context  of  radar  systems  can  be 
postulated  as  hypothesis  testing  problems,  where  a  choice  has  tc 
be  made  among  two  or  more  hypotheses.  The  detection  problems 
addressed  in  this  report  involve  the  following  two  hypotheses: 

Hq:  Desired  signal  is  absent 

Hi :  Desired  signal  is  present 


8 


Hq  is  referred  to  as  the  null  hypothesis,  and  Hq  is  the  alternative 
hypothesis  .  The  model-based  approach  to  the  multichannel 
detection  problem  is  couched  on  the  assumption  that  the  vector 
random  process  at  the  output  of  the  channels  can  be  represented  as 
the  output  of  a  linear  system  (filter)  under  each  of  the  two 
hypotheses,  and  that  a  unique  parametric  model  corresponds  to  each 
hypothesis.  Furthermore,  the  two  parametric  models  (one  for  each 
of  the  two  hypotheses)  must  be  sufficiently  different  to  allow 
selection  of  the  correct  hypothesis  by  the  evaluation  of  measures 
that  are  sensitive  to  those  differences. 

A  particular  measure  that  has  produced  robust  experimental 
results  in  the  model-based  detection  context  (Metford  and  Haykin, 
1985)  is  the  log-likelihood  ratio  (LLR)  test.  This  test  is  the 
result  of  solving  the  hypothesis  testing  problem  using  the  Neyman- 
Pearson  criterion.  The  LLR  test  in  the  context  of  model-based 
detection  is  calculated  using  the  innovations  sequence  at  the 
output  of  each  of  the  two  linear  filters.  This  presents  practical 
and  implementation  advantages. 

Figure  2-1  illustrates  the  architecture  of  an  on-line 
' nnovations-based  multichannel  detector.  In  the  case  of  a  radar 
array  system,  each  of  J  radar  receiver  channels  collects  the 
electromagnetic  energy  arriving  at  its  aperture,  and  processes  it 
to  generate  a  discrete-time  random  sequence,  denoted  as  {Xi(n)}, 
which  contains  the  desired  information.  The  J  random  sequences 
{Xi(n)}  are  represented  in  vector  form  as  {x(n)} .  Michels  (1991)  has 
formulated  the  binary  detection  problem  for  multichannel  systems. 
Specifically,  the  null  hypothesis,  Hg,  corresponds  to  the  case  of 

clutter  and  noise  present  in  the  observation  process  U(n)},  and  the 
alternative  hypothesis.  Hi,  corresponds  to  the  case  of  signal, 
clutter,  and  noise  present  in  the  observation  process  U{n)} .  That 
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is,  the  detection  decision  must  be  made  between  the  following  two 
models , 


(2-la) 

Ho: 

i(n)  =  £(n)  +  yw{r)) 

n  >  n, 

(2-lb) 

Hr. 

i(n)  =  §(n)  +  £(n)  +  w(n) 

n  >  n, 

where  n©  denotes  the  initial  observation  time,  {£(n)}  denotes  the 
clutter  process,  {W.(n)}  denotes  all  the  array  channel  noise 
processes,  and  {S(n)}  denotes  the  desired  signal  (target)  process. 


Figure  2-1.  Innovat ions-based  multichannel  detector  with  on-line 

parameter  identification. 

In  the  model-based  approach  pursued  herein,  a  distinct  state 
variable  model  is  assoc;'-.'  cd  with  each  of  the  two  hypotheses,  and 
a  Kalman  filter  is  designed  for  each  model.  Each  Kalman  filter 
processes  the  observation  sequence  {x(n)}  to  generate  a  vector 
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innovations  sequence;  {£(n  |  Hq)}  denotes  the  innovations  sequence  at 
the  output  of  the  null  hypothesis  filter,  and  denotes  the 
innovations  sequence  at  the  output  of  the  alternative  hypothesis 
filter.  These  innovations  sequences  are  used  in  a  likelihood 
ratio  test  with  a  pre-stored  threshold  to  carry  out  the  detection 
decision . 

As  indicated  in  the  detection  configuration  of  Figure  2-1, 
the  two  filters  can  be  determined  in  real-time  by  processing  the 
observation  sequence  for  a  prescribed  time  interval.  This 
approach  provides  the  most  adaptability,  but  may  present  a  large 
computational  burden  for  some  applications.  It  also  presents 
conceptual  challenges,  such  as  real-time  determination  of  model 
order  for  each  of  the  two  filters.  Alternatively,  the  filter 
design  can  be  carried  out  off-line  for  each  of  the  two  hypotheses, 
and  the  resulting  filter  design  implemented  in  the  real-time 
configuration'.  This  alternative  approach  is  less  robust  to 
changes  in  the  operational  environment,  but  requires  a  simpler 
processor  architecture,  which  is  important  in  many  real-time 
applications'.  Careful  design  of  the  filters  off-line  using 
adequate  simulated  and  real  data  can  lead  to  acceptable 
performance.  Also,  many  pairs  of  fixed  filters  may  be  designed  to 
cover  distinct  operational  conditions.  The  filter  for  the 
alternative  hypothesis  will  be  of  higher  order  than  the  filter  for 
the  null  hypothesis  because  the  observation  process  for  the 
alternative  hypothesis  has  more  information  (the  signal 
component) . 

Michels  (1991)  has  developed  a  likelihood  ratio  calculation 
and  detection  decision  model  which  are  compatible  with  the 
formulation  adopted  herein.  Both  of  these  capabilities  are 
available  at  RL,  and,  where  appropriate,  the  methodology  presented 
in  this  report  is  compatible  with  these  capabilities. 


2 . 2 


The  class  of  multiple-input,  multiple-output  state  variable 
models  can  represent  effectively  the  channel  output  process  for 
radar  applications.  Consider  a  discrete-time,  stationary, 
complex-valued,  zero-mean,  Gaussian  random  process  {i(n)}  defined  as 
the  output  of  the  following  state  space  model  representation  for 
the  system  giving  rise  to  the  observed  process: 

(2-2a)  y(n+1)  =  Fy(n)  +  Gii(n)  n  >  no 

(2-2b)  x(n)  =  H'^y(n)  +  D^^(n)  n  >  Oq 

(2-20  EWno)l=QN 

(2 -2d)  E[y(no)/(no)]  =  Po 

Here  n  =  Oq  denotes  the  initial  time  (which  can  be  adopted  as  0 
since  the  system  is  stationary)  .  Also,  )^(n)  is  the  N-dimensional 
state  of  the  system  with  i'(no)  a  Gaussian  random  vector;  U,(n)  is  the 
J-dimensional,  zero-mean,  stationary,  Gaussian,  white  input  noise 
process;  and  w(n)  is  the  J-dimensional,  zero-mean,  stationary, 
Gaussian,  white  measurement  noise  process.  The  output  (or 

measurement)  process  {A(n)}  is  also  a  J-dimensional  vector  process. 
Matrix  F  is  the  NxN  system  matrix,  G  is  NxJ  input  noise 

distribution  matrix,  is  the  JxN  output  distribution  matrix, 
is  the  JxJ  output  noise  distribution  matrix,  and  Pq  is  the 
correlation  matrix  of  the  initial  state.  All  these  matrices  are 
time-invariant.  Matrix  Pq  is  Hermitian  (P©*^  =  Po^  and  all  its 

eigenvalues  are  real-valued)  arid  positive  definite  (all  its 

eigenvalues  are  positive) . 
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System  (2-2)  is  assumed  to  be  asymptotically  stable,  which 
means  that  all  the  eigenvalues  of  matrix  F  are  inside  the  unit 
circle.  Also,  system  (2-2)  is  assumed  to  be  reachable  and 
observable,  which  implies  that  the  dimension  N  of  the  state  vector 
(also  the  order  of  the  system)  is  minimal  (Anderson  and  Moore, 
1979)  .  That  is,  there  is  no  system  of  lesser  order  which  has 
identical  input/output  behaviour.  Lastly,  system  (2-2)  is  assumed 
to  be  minimum-phase  (its  zeros  are  also  inside  the  unit  circle). 
This  last  assumption  implies  that  the  system  is  defined  uniquely 
by  second-order  statistics.  The  output  distribution  matrices  are 
defined  with  the  conjugate  operator  in  order  to  have  notation 
consistent  with  that  of  the  single-output  system  case,  where  both 
H  and  D  become  vectors,  and  nominally  vectors  are  defined  as 
column  vectors. 

The  input  noise  process  correlation  matrix  is  given  as  (all 
matrices  defined  hereafter  have  appropriate  dimensions) 

(2-3a)  E[u(k)u^(k)]  =  Ruu{0)  =  Q  k  >  Oq 

( 2  -  3b )  E[u(k)u^(k-n)]  =  Ruu(n)  =  [0]  k  >  no  and  n  0 

and  the  output  noise  process  correlation  matrix  is  given  as 

( 2  -  4  a )  E[:a(k)^^{k)]  =  Rwvy(O)  =  C  k  >  no 

(2-4b)  E[w(k)w^(k-n)]  =  Rww(n)  =  [0]  k>no  and  n  0 

Notice  that  matrices  Q  and  C  are  Hermitian.  Matrix  Q  is  at  least 
a  positive  semidefinite  matrix  since  it  is  an  auto-correlation 
matrix  (all  the  eigenvalues  of  a  positive  semidefinite  matrix  are 
non-negative) ,  and  matrix  C  is  assumed  to  be  positive  definite 
(this  can  be  relaxed  to  positive  semi-definite,  but  positive 


definiteness  i  =5  more  realistic  since  in  the  radar  problem  3^(n) 
represents  channel  noise  and  other  such  noise  processes  which  are 
independent  from  channel  to  channel) . 

In  the  most  general  form  for  this  model  the  input  and  output 
noise  processes  are  correlated,  with  a  cross-correlation  matrix 
defined  as 

( 2  -  5  a )  E[u(k)^(k)]  =  Ruv»(0)  =  S  k  >  Hq 

( 2  -  5b )  E[u{k)^(k-n)]  =  Ruw(n)  =  [0]  k  >  Pq  and  n  ^  0 

In  general,  matrix  S  is  not  Hermit ian.  Both  the  input  and  output 
noise  processes  are  uncorrelated  with  the  present  and  past  values 
of  the  state  process,  and  this  is  expressed  in  terms  of  cross¬ 
correlation  matrices  as 

(2-6a)  E[jt(k)u«(k-n)]  =  R,„(n)  =  10]  k  >  Pq  and  P  ^  0 

{ 2-6b)  E[y{k)^^(k-p)]  =  Ryw(n)  =  [0]  k  >  Pq  and  P  >  0 

The  correlation  matrix  of  the  state  is  defined  as 

(2-7)  E[y(p)y^(p)]  =  Ryy(p)  =  P{n)  k  >  Po  and  P  >  0 

It  follows  from  (2-2a)  and  the  above  definitions  that  the  state 
correlation  matrix  satisfies  the  following  recurrence  relation, 

(2-8)  P(p+1)  =  FP(p)F^  +  GQG^  p>no 

In  general,  matrix  P(p)  is  Hermitian  and  positive  definite.  Since 
system  (2-2)  is  stationary  and  asymptotically  stable,  and  since 


matrix  Q  is  positive  definite,  then  the  following  steady-state 
(large  n)  value  exists  for  the  recursion  (2-8) : 


(2-9)  P(n+1)  =  P(n)  =  P 

Under  steady-state  conditions  Equation  (2-8)  becomes  a  Lyapunov 
equation  for  the  steady-state  correlation  matrix, 

(2-10)  P  =  FPF^  +  GQG^ 

The  conditions  for  steady-state  also  insure  that  the  solution  to 
Equation  (2-10)  exists,  is  unique  (for  the  selected  state  space 
basis),  and  is  positive  definite  (Anderson  and  Moore,  1979)  . 
Matrix  P  is  unique  for  a  given  state  space  basis.  However,  if  the 
basis  of  the  input  noise  vector  and/or  the  basis  of  the  state 
vector  are  changed  by  a  similarity  and/or  an  input  transformation, 
then  a  different  state  correlation  matrix  results  from  Equation 
(2-10)  . 

The  correlation  matrix  sequence  of  the  output  process  {)l(n)}  is 
defined  as 

(2-iia)  EIx(k)x>-n)l  =  R„(n)  =  A„  V  k  and  n  >  0 

(2-1  lb)  Rxx(-n)  =  Rxx(n)  Vn 

For  system  (2-2)  the  correlation  matrix  Rxx(n)  can  be  expressed  in 
factored  form,  with  the  system  parameter  matrices  as  factors: 

(2-12a)  An  =  Rxx(n)  =  H^F^'Y  n>0 

(2-l2b)  An  =  Rxx(n)  =  r^[F"-^]^H  =  r”[FT‘^H 
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n  <  0 


Here  denotes  F  raised  to  the  (n-1)th  power,  and  T  denotes  the 

following  cross-correlation  matrix 

(2-13)  r=E[y(n)xV-1)]  =  Ryx(1)  =  FP(n)H  +  GSD  Vn>0 

The  correlation  matrix  sequence  factorization  in  Equation  (2-12) 
is  the  )cey  to  most  correlation-based  stochastic  realization 
algorithms.  The  zero-lag  (O^O)  output  correlation  matrix  is 

(2-14)  Rxx(O)  =  H^P(n)H  +  D^CD  =  Ao 

and  matrix  Rxx(O)  is  Hermitian  and  at  least  positive  semidef  inite . 
In  steady-state,  P  replaces  P(n)  in  Equations  (2-13)  and  (2-14) . 

As  can  be  inferred  from  the  above  relations,  the  system 
parameters  {F.  G.  H,  D.  Q.  C.  S.  P.  n  completely  define  the  second-order 

statistics  (the  correlation  matrix  sequence  {Rxx(h)})  of  the  output 
process,  and  it  is  said  that  system  (2-2)  realizes  the  output 
correlation  matrix  sequence.  Conversely,  the  second-order 
statistics  of  the  output  process  provide  sufficient  information  to 
identify  the  system  parameters,  although  not  uniquely.  Since  the 
output  process  has  mean  equal  to  zero  and  is  Gaussian-distributed, 
the  second-order  statistics  define  the  process  completely. 

From  the  system  identification  (stochastic  realization)  point 
of  view,  the  problem  addressed  herein  can  be  stated  as  follows: 
given  the  output  data  sequence  {x(n)}  of  system  (2-2),  estimate  a 
set  of  system  parameters  {F.  G.  H.  D,  Q.  C.  S.  P,  n  which  generates  the 

same  output  correlation  matrix  sequence  as  system  (2-2)  . 
Furthermore,  the  identified  parameter  set  must  correspond  to  a 
system  realization  of  minimal  order  (with  state  vector  y  of  minimal 
dimension) .  The  solution  to  this  problem  is  pursued  herein  via  a 
two-step  approach:  first  an  estimate  of  the  output  correlati.on 
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matrix  sequence  is  calculated,  and  then  the  estimated  correlation 
sequence  is  used  to  determine  the  system  parameters. 


It  is  well  known  (Anderson  and  Moore,  1979)  that  there  can  be 
an  infinity  of  systems  (2-2)  with  the  same  output  correlation 
matrix  sequence.  The  set  of  all  systems  that  have  the  same  output 
correlation  matrix  sequence  is  an  equivalence  class,  and  any  two 
systems  belonging  to  the  set  are  said  to  be  correlation  equivalent 
(Candy,  1976).  For  example,  the  output  correlation  matrix 
sequence  remains  invariant  to  a  si.milarity  transformation  applied 
to  the  state  vector.  Similarly,  the  output  correlation  matrix 
sequence  remains  invariant  also  to  a  non-singular  t rans format: ion 
applied  to  the  input  noise  and/or  to  the  output  noise.  As  shown 
by  Candy  (1976),  the  equivalence  class  of  correlation  equivalent 
systems  is  defined  including  other  operations  besides  a  change  of 
basis . 

As  inferred  from  these  comments,  the  solution  to  t.he  syste.m 
identification  problem  is  not  unique.  It  is  also  true  that  .most 
of  the  possible  system  parameter  solutions  do  not  possess 
desirable  properties.  There  is,  however,  a  solution  which  has 
several  features  of  importance.  This  solution  is  referred  to  as 
the  innovations  representation  for  system  (2-2),  and  is  discussed 
in  Section  2.3.  The  identification  algorithm  of  Section  3.0 
generates  system  parameter  matri.x  estimates  for  the  innovations 
representation . 

In  general,  the  system  matrix  parameters  resulting  from  the 
identification  algorithm  will  be  represented  in  a  different  basis, 
and  should  be  denoted  with  a  different  symbol  (say,  F.|  instead  of 

F,  etc.);  nevertheless,  the  same  symbol  will  be  used  in  this 
report  in  order  to  simplify  notation. 


Several  definitions  and  notation  associated  with  the  input 
/'output  behaviour  of  system  (2-2)  are  important.  Consider  first 
the  L-term  (finite)  controilabiliLv  matrix  of  system  (2-2),  C\_; 

this  matrix  is  defined  as  an  NxJL  partitioned  matrix  of  the  form 
(2-15)  4  =  [G  FG  •••  F^'^G  ] 

As  is  well-known,  for  a  minimal-order  system  matrix  Ci  has  rank  N 
for  L  >  N .  The  controllability  matrix  maps  the  input  space  onto 
the  -State  space.  Analogously,  the  L-term  observability  matrix  of 
system  (2-2)  is  the  following  JLxN  partitioned  matrix, 

LjHp 

(2-16)  Ol  = 

and  for  a  minimal-order  system  the  rank  of  matrix  0i  is  equal  to  N 
for  L>N.  The  observability  matrix  maps  the  state  space  onto  the 
output  space.  Classical  realization  theory  for  the  deterministic 
case  (see  Appendix  B)  is  based  on  the  fact  that  a  deterministic 
system  block  Kankel  matrix  can  be  represented  as  the  product  of 
the  observability  and  controllability  matrices.  Let  Hl  (_  denote 
the  JLxJL  deterministic  block  Hankel  matrix  with  the  impulse 
response  matrix  A(i+j-1)  as  its  (i,j)th  block  element  (a  block  Hankel 
matrix  is  a  matrix  in  which  the  (i,j)th  block  element  is  a 
function  of  i+j) .  That  is. 


A(1)  A(2)  •••  A(L) 

A(2)  A{3)  ...  A{L+1) 

(2-17)  Hll  =  = 

A(L)  A(L+1)  •••  A(2L*1) 

The  form  of  Equation  (2-17)  follows  from  the  definition  of  the 

impulse  response  matrix  sequence  {A(n)}  for  a  deterministic  system, 

(2-18)  A(n)  =  H^F"'‘'G  n>1 

As  shown  in  Appendix  B,  for  L>N  the  rank  of  the  deterministic 
block  Hankel  matrix,  Hn_,  is  equal  to  the  system  order,  N.  In 
fact,  it  is  true  also  that  rank(Hjy^^l(  =  N  for  k>1,  and  that  the 
elements  of  the  impulse  response  matrix  sequence  {A{n)}  satisfy  a 

set  of  recursion  relations  (Equation  (B-7))  of  order  equal  to  the 
minimal  polynomial  of  matrix  F.  The  block  columns  (rows)  of  H|__l 

satisfy  the  same  recursion  relations  due  to  the  sequential 
arrangement  of  the  matrices  {A(n)}  as  block  elements  of 

Notice  that  the  representation  (2-18)  of  the  impulse  response 
matrix  sequence  is  of  the  same  form  as  the  representation  of  the 
correlation  matrix  sequence  in  Equation  (2-12) .  Due  to  this 
similarity  the  matrix  elements  of  the  correlation  matrix  sequence 
{A^}  satisfy  the  same  set  of  recursion  relations  as  the  matrix 
elements  of  the  impulse  response  matrix  sequence  {A(n)},  and  the 
above-discussed  properties  of  the  deterministic  Hankel  matrix  are 
also  properties  of  the  stochastic  Hankel  matrix. 

Associated  with  system  (2-2)  is  a  backward  time  model  which 
is  defined  from  the  system  model  (2-2) .  Backward  time  models  play 
a  role  in  the  formulation  of  a  large  class  of  stochastic 
realization  algorithms.  The  backward  time  model  for  system  (2-2) 
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is  defined  as  a  discrete-time,  stationary,  complex-valued,  zero- 
mean,  Gaussian  random  process  with  a  state  space  representation  of 
the  form  (Faurre,  1976) 


(2-1 9a)  S(n)  =  F^5(n+1) +yj(n) 

(2-1 9b)  >((n)  =  r^S(n)  +  \^(n) 

where  5.(n)  is  the  N-dimensional  state  vector,  Vj(n)  is  the  N- 
dimensional  input  noise  vector,  and  ^(n)  is  the  J-dimens ional 
output  noise  vector.  Both  noise  vectors  are  uncorrelated  in  time 
(white),  have  mean  equal  to  zero,  and  are  Gaussian-dist ributed . 
It  is  important  to  note  that  matrix  F  in  Equation  (2-19b)  is  the 

same  matrix  which  appears  in  the  factorization  of  the  output 
correlation  matrices  in  Equation  (2-12),  and  is  defined  in 
Equation  (2-13)  . 

The  L-term  observability  matrix  for  the  bac)cward  system  (2- 
19)  is  the  following  JLxN  partitioned  matrix, 

r^pH 

The  baclcward  system  is  completely  observable  also,  which  implies 
that  ran)c((Z7|_)  =  N .  Also  of  interest  is  the  conjugate  transpose  of 
(Z\,  which  is, 

(2-21)  !Df^  =  [r  FT  F^'Y  ] 


(2-20)  Dl  = 
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Matrix  is  a  controllability  macrix  for  the  matrix  pair  (F.  n, 

and  as  such,  it  can  be  viewed  as  the  controllability  matrix  for 
the  dual  system  corresponding  to  the  backward  model  (if  system  B 
is  a  dual  for  system  A,  then  the  input  of  system  3  is  the  output 
of  system  A,  and  the  output  of  system  B  is  the  input  of  system  A; 
that  is,  the  roles  of  input  and  output  are  interchanged)  . 
However,  in  this  report  it  is  preferable  to  refer  to  as  the 

backward  observability  matrix. 

In  the  context  of  stochastic  realization  theory,  the 
significance  of  the  backward  model  follows  from  Equation  (2-20) 
and  the  Hankel  matrix  of  output  correlation  matrices,  as  shown 
next.  Define  a  stochastic  Hankel  matrix  ^l.L  the  following 
JLxJL  block  matrix, 

Ai  A2  •  •  •  Al 
Ag  A3  • . .  Al^^ 

Al  Al^I  •••  AgL.i 

where  the  block  elements  {An}  are  the  elements  of  the  output 
correlation  matrix  sequence.  Equation  (2-12)  .  It  follows  from 
Equations  (2-12),  (2-16),  (2-21),  and  (2-22)  that 

(2-23) 

This  equation  is  fundamental  to  stochastic  realization  theory  from 
conceptual  as  well  as  algorithmic  viewpoints.  From  a  conceptual 
viewpoint.  Equation  (2-23)  is  a  factorization  of  the  Hankel  matrix 
into  the  observability  matrices  of  the  forward  and  backward 
systems,  and  thus  hints  at  the  underlying  structure  of  the 

2  1 


correlations  between  the  past  and  future  output  vectors  (as 
discussed  below  and  in  Section  3.0). 

From  an  algorithmic  viewpoint,  the  similarities  between  the 
deterministic  and  stochastic  bloc)c  Hankel  matrices  and  their 
respective  factorizations  implies  that  the  properties  which  are 
true  for  the  deterministic  case  are  true  also  for  the  stochastic 
case.  Specifically,  the  most  important  of  these  properties  are; 

i)  rank(!^L)  =  N  for  L>N, 

ii)  rank(^^^(^ =  N  for  k>1,  and 

iii)  the  block  columns  (rows)  of  ^l,L  satisfy  the  same 
recursion  relations  as  the  block  columns  (rows)  of 

Furthermore,  the.  similarities  between  Equation  (2-17)  and 
Equations  (2-22)  and  (2-23)  allow  the  application  of  classical 
deterministic  realization  concepts,  insight,  and  algorithms  (see 
Appendix  B)  to  the  stochastic  realization  problem  formulated  with 
output  correlation  matrices. 

Other  important  matrices  in  stochastic  realization  theory 
include  the  JLxJL  "future"  and  "past"  block  correlation  matrices. 
These  matrices  are  the  correlation  matrices  of  future  and  past 
output  block  vectors  defined  as 


(2-24)  Xp  =  x(n-1;n-L)  = 


A(n-1) 

2i(n-2) 

.  i(n-L)  . 
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(2-25)  ^  =  x(n;n+L-1)  = 


With  respect  to  the  time  instant  n,  vector  Xp  represents  the  past 
of  the  process  U(n)},  and  vector  represents  the  future  of  the 
process  {x(n)} .  Given  these  definitions,  the  following  matrices  can 
be  introduced: 


A(n) 

X(n+1) 

2i(n+L-1) 


(2-26) 


^:L.L  “  ^^2^3  “ 


Aq 

A.1  Aq 


^1-L  ^2-L 


^L-2 


"0  J 


(2-27)  ^;L,L  “  = 


Aq  a., 


L  Al.i  Al.2 


M-L 


Ai  Aq  A; 


2-L 


'0  J 


where  ‘Kj=±x  and  ^.L.L  are  the  JL;'JL  future  and  past  bloclc 
correlation  mal.i-ices,  respectively.  Both  of  these  matrices  are 
Hermitian  (as  -  ell  as  bloc)c  Hermitian)  ,  and  they  exhibit  a  bloc)c 
Toeplitz  structure  (a  block  Toeplitz  matrix  is  a  matrix  in  which 
the  (i, j)th  block  element  is  a  function  of  i-j) .  It  is  important 
to  note  that,  in  general,  the  conjugate  transpose  of  l 
equal  to  %;L,L'  even  though  these  matrices  are  the  block  Hermitian 
of  each  other;  that  is,  matrices  and  ^±X  are  not  the 

element-by-element  Hermitian  of  each  other. 
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Another  matrix  of  interest  is  the  block  cross-correlation 
matrix  between  the  future  and  the  past,  which  is  defined  as 


(2-28) 


Ai  A2 

A2  A3 


L  \  \+1  ■■■  -^21-1  J 


Equations  (2-26) - (2-28)  are  valid  for  all  n  because  the  process 
{2i{n)}  is  stationary.  The  stochastic  realization  approach  of  Akaike 
(1974,  1975)  is  based  on  these  block  correlation  matrices. 

For  L >  N,  Equations  (2-26) - (2-28)  define  the  correlation 

structure  of  syftem  (2-2).  As  indicated  in  Equation  (2-28),  the 
block  cross-correlation  matrix  equal  to  the  stochastic 

block  Hankei  matrix.  Equation  (2-22) .  Thus,  as  hinted  earlier, 
the  cross-correlation  between  the  past  and  future  outputs  admits  a 
factorization  in  terms  of  the  forward  system  and  backward  system 
observability  matrices. 

2 . 3  Innovations  Representation 

The  innovations  representation  is  a  very  powerful  concept  in 
the  theory  of  linear  stochastic  systems  due  to  its  simplicity  and 
its  characteristics.  Several  texts  and  papers  discuss  this 
concept  in  detail.  The  discussion  herein  is  adapted  mostly  from 
Anderson  and  Moore  (1979),  which  provide  a  lucid  presentation. 

The  innovations  representation  for  a  system  (2-2)  is  a 
discrete-time,  stationary,  complex-valued,  system  of  the  form 

(2-29a)  (2t(n+1)  =  Fct(n)  +  Ke(n)  n  >  Hq 
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(2-29b) 

X(n)  =  H”a(n)  +  £(n) 

n  >  ho 

(2-29C) 

a(no)  =  On 

(2-29d) 

E[a(no)a”(no)]  =  n(n„)  =  Ho  =  [0] 

(2-29e) 

E[a(n)a”(n)]  =  n(n) 

n  >  ho 

(2-29f) 

n(n)  =  n  as  n  — » oo 

(2-29g) 

=  f^xx(n) 

V  n 

here  2t(n) 

is  the  N-dimensional  state. 

X(n)  is  the  J-dimensional 

output,  and  the  input  process  {£(n)}  is  the  innovations  process  for 
system  (2-2)  .  That  is,  {£(n)}  is  a  J-dimensional ,  zero-mean,  white 
Gaussian  process  with  correlation  matrix  structure  given  as 

(2-30a)  0.  =  E[£(k)£^(k)]  =  R,x(0)*H^nH  =  Ao  -  k  >  ho 

( 2  -  3  Ob )  E[£(k)£‘^(k-n)]  =  [0]  k  >  Hq  and  n  ^  0 

The  state  correlation  matrix  n(n)  has  a  steady-state  value  because 

the  system  is  asymptotically  stable  (stationary),  and  the  steady- 
state  value,  n,  is  obtained  as  the  limiting  solution  to  the 

following  recursion 

(2-3ia)  n(n+1)  =  Fn(n)F^  +  [Fn(n)H-r]lAo-H^n(n)H]-'' [Fn(n)H-r]^  n>no 

(2-3ib)  n(no)  =  no  =  [0] 

Matrix  K  in  Equation  (2-29a)  is  given  as 
(2-32a)  K  =  [r-FnH]  Q-1  =  [r  -  FRH]  [Ao  - 
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(2-32b)  K  =  GSDQ-^  =GSD[Ao-H^nH]-’ 


where  the  second  relation  follows  from  the  definition  of  F  in 
Equation  (2-13)  and  of  Q  in  Equation  (2-30a)  .  In  the  cases  where 
the  inverse  of  the  correlation  matrix  Q  does  not  exist,  its 
pseudoinverse  is  used  instead  in  Equations  (2-31)  and  (2-32) . 

Matrices  F,  H,  Ao,  and  r  are  as  defined  for  system  (2-2) . 
That  is,  system  (2-29)  is  related  to  system  (2-2)  .  In  fact, 
system  (2-29)  as  defined  above  is  the  steady-state  innovations 
representation  for  system  (2-2)  .  This  representation  has  the 
following  important  features. 

(a)  First  and  foremost,  the  correlation  matrix  sequence  of 
{z(^)}  equal  to  the  correlation  matrix  sequence  of 

-as  indicated  in  Equation  (2-29g)  .  That  is,  the 
processes  {x(n)}  and  {x(n)}  are  correlation  equivalent. 

This  means  that  the  innovations  representation  is  a 
valid  solution  to  the  system  identification  problem 
defined  herein. 

(b)  Of  all  the  correlation  equivalent  representations  for 
a  given  output  correlation  sequence,  the  innovations 
representation  has  the  smallest  state  correlation 
matrix,  R  (smallest  is  meant  in  the  sense  of  positive 
definiteness;  that  is,  R-i  is  smaller  than  n2  if  02  - 
Ri  is  a  positive  definite  matrix)  .  This  property  of 
the  innovations  model  is  significant  because  the  state 
correlation  matrix  is  a  measure  of  the  uncertainty  in 
the  state  . 


26 


(c)  The  innovations  representation  is  directly  related  to 
the  steady-state  Kalman  filter  (in  the  one-step 
predictor  formulation)  for  system  (2-2).  In  fact,  the 
steady-state  Kalman  filter  for  system  (2-2)  is 
available  immediately  upon  definition  of  the  steady- 
state  innovations  representation,  and  viceversa. 
Specifically,  matrix  K  of  Equations  (2-29a)  and  (2- 
31)  is  the  steady-state  Kalman  gain  of  the  optimal 
one-step  predictor  for  system  (2-2)  .  This  is  true 
provided  that  the  eigenvalues  of  F-KH^  are  stable. 
Thus,  the  innovations  model  is  defined  as  above  for 
all  processes  of  the  form  (2-2),  but  the  steady-state 
Kalman  filter  is  defined  only  if  F  -  KH^  is  stable. 

(d)  The  process  {£(n)}  in  Equations  (2-29)  and  (2-30)  is 
correlation  equivalent  to  the  innovations  sequence  of 
system  (2-2)  .  This  is  the  reason  for  referring  to 
system  (2-29)  as  the  "innovations  representation"  for 
system  (2-2)  , 

(e)  The  innovations  model  (2-29)  is  causally  invertible. 

This  means  that  the  present  and  past  of  the  process 
{£(n)}  can  be  constructed  from  the  present  and  past 
values  of  the  output  process  {X(n)}.  The  converse 

statement  is  true  also;  that  is,  any  causally 
invertible  model  is  an  innovations  representation  for 
some  system.  Causal  invert ibility  of  system  (2-29) 
can  be  demonstrated  easily.  From  Equation  (2-29b), 

(2-33)  £(n)  =  -  H^2t(n)  +  X(n) 

Substituting  this  expression  for  £(n)  into  Equation  (2- 
29a)  results  in 
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(2-34) 


a(n+1 )  =  (F  -  KH^lii(n)  +  Kx{n) 


These  relations  demonstrate  the  causal  invertibility 
of  the  innovations  model  (the  input  and  output 
variables  have  traded  places)  . 

(f)  Matrix  F-KH"  in  the  inverted  innovations  model  is  a 
stable  matrix.  This  follows  from  the  act  that  the 
matrix  pair  (F,  H)  is  observable,  and  implies  that  the 
Kalman  filter  for  system  (2-2)  is  stable  also. 

(g)  The  transfer  function  of  the  innovations  model  (2-29) 
is  minimum  phase.  This  is  related  to  the  fact  that 
the  innovations  model  is  correlation  equivalent  to 
system  (2-2),  and  second-order  moment  information  (the 
output  correlation  matrix  sequence)  does  not  contain 
any  phase  information. 

(h)  The  innovations  representation  for  a  system  of  the 
form  (2-2)  is  unique.  Given  that  the  innovations 
representation  has  the  same  output  covariance  sequence 
as  system  (2-2)  ,  the  fact  that  it  is  unique  eliminates 
searching  for  other  represe.ntations  for  system  (2-2) 
with  the  properties  listed  herein. 

(i)  The  innovations  model  (2-29)  can  be  computed  from  the 

output  correlation  matrix  sequence  of  system  (2-2) . 
This  fact  simplifies  the  parameter  identification 
problem  because  the  set  of  parameter  matrices  that 
must  be  estimated  from  the  data  is  reduced  to  just 
five:  {F,  H,  F,  H,  Aq  )  (given  these  parameter 

matrices,  the  innovations  covariance,  Q,  and  the 
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Kalman  gain,  K,  are  obtained  using  Equations  (2-30a) 
and  (2-32a) ,  respectively) . 

All  the  features  listed  above  are  of  relevance  to  the 
identification  approach  presented  in  Section  3.0  because  the 
selected  parameter  identification  algorithm  generates  the 
innovations  representation  for  the  given  output  correlation  matrix 
sequence,  following  feature  (i). 

The  baclcward  model  has  an  associated  backward  innovations 
model  with  parameter  matrices  F,  T,  and  the  backward  Kalman  gain. 

Most  of  the  features  (a)-(i)  that  describe  the  forward  innovations 
model  are  valid  also  for  the  backward  innovations  model,  with  a 
notable  exception  of  feature  (b)  ,  which  needs  to  be  replaced  by 
the  following  statement:  For  each  valid  correlation  equivalent 
representation  for  a  given  output  correlation  sequence,  the  state 
correlation  matrix  is  smaller  than  the  inverse  of  the  state 
correlation  matrix  for  the  backward  innovations  model.  More 
specifically,  let  denote  the  state  correlation  matrix  for  the 
backward  innovations  model  in  steady-state  conditions,  and  let  L 
denote  the  state  correlation  matrix  for  any  valid  correlation 
equivalent  representation  of  an  output  correlation  sequence. 
Then,  -  Z  is  a  positive  definite  matrix.  This  result  provides 

an  upper  bound  for  the  state  correlation  matri.x  of  a  correlation 
equivalent  representation.  Combining  this  with  the  lower  bound  of 
property  (b)  of  the  forward  i.nnovations  model  gives 

(2-35)  n  <  s  < 

As  before,  inequality  between  two  matrices  is  intended  in  the 
sense  of  positive  semi-definiteness  of  the  matrix  difference. 
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3.0  MULTICHANNEL  SYSTEM  IDENTIFICATION 


Identification  of  the  model  parameter  matrices  (F,  H,  F,  FI,  Aq} 
for  the  innovations  representation  is  carried  out  based  on  t.he 
predictor  space  concept  and  the  canonical  correlations  methodology 
formulated  by  Akaike  {1974;  1975),  and  using  the  specific 

algorithmic  development  of  Detai  et  al.  (1985),  extended  to  the 
case  of  complex-valued  data.  Their  results,  in  turn,  are  built 
upon  the  correlation  equivalence  results  obtained  by  Faurre 
(1976),  and  the  deterministic  realization  theory  and  algorithm  of 
Ho  (Kalman  et  ai.,  1969).  The  identification  algorithm  requires 
the  output  correlation  matrix  sequence;  since  the  true  output 
correlation  sequence  is  not  available,  an  estimate  must  be 
obtained . 

3 . 1  Covariance  Sequence  Estimation 

The  first  step  in  the  modeling/ identification  procedure  is 
the  estimation  of  the  output  correlation  matrix  sequence  {R  XX  (n)}  = 
{An}  for  n>0  (for  notational  simplicity,  Dq  =  0  is  assumed  in  this 

section)  given  a  finite-length  realization  of  the  output  process, 
{2L(n)  ln  =  0,1,...,Nj-1}.  There  are  two  nominal  estimators  for 

correlation  matrices.  The  first  estimator  is  of  the  form 

-  ^  Nr-1 

(3-1)  An  =  Rxx{n)  =  — J —  X  n<NT-  1 

Estimator  (3-1)  provides  an  unbiased  estimate  of  the  output 
correlation  matrix  sequence  (that  is,  the  expected  value  of  (3-1) 
is  equal  to  the  true  correlation  matrix  sequence) ,  but  there  have 
been  cases  where  the  use  of  this  estimator  has  led  to 
com.putat  iona  1  difficulties.  In  particular,  sometimes  when 
estimator  (3-1)  is  used  to  form  a  Toeplitz  block  correlation 
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matrix,  the  Toeplitz  matrix  is  not  positive  definite  (or  at  least 
positive  semi-definite)  as  it  should  be. 

The  second  estimator  is  of  the  form 

h4T-1 

(3-2)  An  =  Rxx(n)  =  -Tj-  X  n<NT- 1 

*'*1  k-O 

with  zeros  used  in  the  place  of  missing  data  x(*2),  .  .  . 

Estimator  (3-2)  provides  a  biased  estimate  as  a  result  of  padding 
the  data  record  with  leading  zeros.  However,  division  by  Nj  in 

(3-2)  for  all  lags  drives  the  correlation  estimates  to  exhibit  an 
enhanced  monotonically  decreasing  behaviour  as  a  function  of  n 
(the  enhancement  is  with  respect  to  the  correlation  estimates 
resulting  from  Equation  (3-1)).  Such  a  feature  is  desirable 
because  the  output  correlation  sequence  of  a  stationary  system 
(with  matrix  F  stable)  is  monotonically  decreasing.  This  feature 
of  estimator  (3-2)  has  provided  improved  performance  (in  relation 
to  estimator  (3-1))  in  algorithms  such  as  the  scalar  Yule-Walker 
method  for  spectrum  estimation  by  insuring  that  the  Toeplitz 
correlation  matrix  which  arises  in  that  problem  be  at  least 
positive  semidef inite .  It  is  possible  that  this  feature  of 
estimator  (3-2)  be  of  similar  relevance  with  the  Hankel  matrix  and 
the  Toeplitz  matrices  that  arise  in  the  stochastic  realization 
problem  considered  in  Section  3.0.  For  large  values  of  Nj  and 

small  values  of  the  maximum  lag  considered,  Nmax/  estimator  (3-2) 

approximates  the  behaviour  of  estimator  (3-1),  and  any  differences 
become  insignificant.  However,  for  small  values  of  Nj  and/or  for 
values  of  Nmax  close  to  Nj,  each  estimator  may  offer  specific 
advantages  in  the  context  of  distinct  problems.  Which  estimator 
is  preferable  in  the  context  of  the  problem  of  interest  herein  is 
a  topic  for  future  investigation. 


3 . 2  Canonical _ Correiationa _ Alaoritbn 

In  comparison  with  other  alternative  stochastic  realization 
techniques,  the  canonical  correlations  algorithm  used  herein  has 
several  advantages  for  m\i  It  ichanne  1  detection  applications,  as 
listed  next  (Desai  et  al.,  1985)  . 

•  Identifies  the  parameters  for  a  model  in  the  state-space 
class,  which  is  more  general  than  the  time  series  class. 

•  An  approximately  balanced  (in  the  stochastic  sense) 
state  space  realization  is  generated,  thus  providing  a 
built-in  and  robust  mechanism  for  model  order  selection. 

•  Identifies  the  innovations  representation  of  the  system 
and  generates  the  state  correlation  matrix  and  the 
Kalman  gain  directly.  Thus,  the  Kalman  filter  is 
obtained  without  having  to  solve  a  nonlinear  discrete 
matrix  Riccati  equation. 

•  Implementation  of  the  algorithm  involves  the  singular 
value  decomposition  (SVD) ,  which  is  a  stable  numerical 
method . 

These  features  offer  enhanced  model-baseJ  detection  performance  in 
relation  to  algorithms  such  as  those  based  on  time  series  models. 
A  discussion  of  the  canonical  correlations  algorithm  is  provided 
in  the  remainder  of  this  section.  This  discussion  complements  and 
extends  the  material  presented  in  Appendices  B  and  C,  as  it  is 
applied  to  the  stochastic  realization  problem. 

The  canonical  correlations  identification  algorithm  is  based 
on  the  concept  of  the  correlation  structure  between  the  past  and 
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future  of  the  output  process  {iL(n)},  in  the  context  of  Hilbert 
spaces  of  random  variables  (Akaike,  1974,  1975;  Faurre,  1976)  . 
Consider  the  stochastic  process  {2L(ri)}  and  define  infinite¬ 
dimensional  block  vectors  and  Xj  as 

X(n-1) 
x(n-2) 

X(n-3) 

x(n) 

2i(n+1) 
x(n42) 


These  vectors'  represent  the  past  and  future  of  the  process  with 
respect  to  time  n,  as  in  the  case  of  the  finite-length  vectors  in 
Equations  (2-24)  and  (2-25)  .  Note  that  the  time  n  can  be  any 
instant  of  time  because  the  process  is  stationary. 

Vector  Xj,  indexed  at  time  n  (as  defined  in  Equation  (3-4)) 

spans  a  vector  space  denoted  as  ^(n),  which  represents  the  set  of 
all  possible  linear  combinations  of  the  elements  of  X^,  and  is 

referred  to  as  the  "past"  of  the  process  Wn)}.  Analogously,  vector 
Xj  indexed  at  time  n  spans  a  vector  space  denoted  as  X‘^(n), 

representing  the  set  of  all  possible  linear  combinations  of  the 
elements  of  and  is  referred  to  as  the  "future"  of  the  process 

{i{n)} .  The  time  index  is  relevant  in  the  stochastic  realization 
problem  considered  in  this  section  because  the  process  Wn)}  is  a 
dynamic  time  series.  In  contrast.  Appendix  C  presents  the 
canonical  correlations  formulation  for  the  static  multivariate 


case . 


Now  let  ^(n)  be  the  space  generated  by  the  orthogonal 
projection  of  X^(n)  onto  A“(n);  that  is, 

(3-5)  ;?(n)  =  X^(n)|A“(n) 

^(n)  is  referred  to  as  the  state  space  of  the  process  {>((0)}  because 

it  is  spanned  by  the  state  of  the  innovations  representation 
(Equations  (2-29)).  For  a  system  of  the  form  (2-2),  is 

finite-dimensional,  with  dimension  equal  to  N.  The  space  ^^(n)  can 
be  represented  as  the  direct  sum  of  two  orthogonal  subspaces, 

(3-6)  A^(n)  =  X^(n)|r-(n)  0  Hn)  =  j?(n)  0  2:(n) 

where  J?(n)  J.  !£(n),  and  (E(n)  is  the  space  spanned  by  the  innovations 
process,  {£(n)} .  Equation  (3-6)  defines  the  geometric  structure  of 
the  space  This  structure  is  true  for  all  n  because  the 

process  is  stationary. 

Since  all  the  random  variables  involved  are  zero-mean  and 
Gaussian-distributed,  an  orthogonal  projection  in  these  vector 
spaces  is  equivalent  to  conditional  expectation  (Faurre,  1976). 
Specifically,  j^(n)  is  the  space  spanned  by  the  elements  of 

(3— 7 )  =  E[  x^l  {E[2i5> 21311 1  ~  ^ 2^ 

Here  the  caret  ('')  denotes  the  conditional  expectation  (which  is 

also  an  optimal  estimate  given  the  underlying  conditions);  also, 
O,  'D ,  and  ^  are  the  inf inite-dimensional  versions  of 

Equations  (2-22),  (2-16),  (2-20),  and  (2-26),  respectively.  Now 
the  algebraic  representation  of  the  geometric  expression  (3-6)  is 
obtained  as 

(3-8)  2S^  =  +  £n  =  E[2i^|2i^  +  £n 
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where  is  an  infinite-dimensional  block,  vector  having  the 

innovations  sequence  as  block  elements, 


(3-9) 


Sn  = 


£(n) 

£(n+1) 

£(n+2) 


Of  particular  interest  is  the  output  vector  at  time  n,  which  is 
the  first  block  element  of  Equation  (3-8) , 

(3-iOa)  j<;(n)  =  £(n|n-1)  +  e(n)  =  E[^(n)|xJ  +  £(n) 

(3-iOb)  x(n)  =  ^  +  £(n)  =  e[  x(n)  x^]  2^^  +  £(n) 

where  2(n|n-1)  denotes  the  minimum  variance  estimate  of  the  output 
process  at  time  n  based  on  output  measurements  up  to  time  n-1  . 
This  last  expression  is  suggestive  of  the  output  equation  of  the 
innovations  representation.  Indeed,  it  does  correspond  to 
Equation  (2-29b) ,  as  shown  next. 

Let  a(n)  denote  the  following  N-dimensional  vector  (since 
matrix  has  N  rows)  , 

(3-11)  a(n)  = 

The  elements  of  a(n)  span  the  space  .^^(n).  This  is  true  because  the 
elements  of  i j  span  and  because  the  observability 

matrix  has  full  rank.  In  fact,  £L(n)  is  the  state  of  the 
innovations  representation  at  time  n.  This  provides  the  final 
piece  of  information  needed  to  complete  the  innovations 
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representation  output  equation.  Recall  that  matrix  H  occupies 
the  first  J  rows  of  the  observability  matrix.  Then,  from  Equations 
(3-7 )- (3-11) ,  it  follows  that 

(3-12)  =  Ofl(n)  +  £n 

(3-13)  4!l(n)4]  = 

(3-14)  x(n)  =  *  £(n)  =  H”a(n)  +  £(n) 

Equation  (3-12)  is  an  analytic  representation  oi  the  statement 
that  the  observability  matrix  maps  the  state  space  onto  the  output 
space.  And  Equation  (3-14)  is  the  output  equation  for  the 
innovations  representation. 

The  system  identification  (stochastic  realization)  problem 
can  be  stated  now  as  follows:  determine  the  factors  O  and  D  of  the 

stochastic  Hankel  matrix, 

(3-15)  *  O'lP 

in  the  basis  of  the  innovations  representation.  In  that  basis, 
the  state  vector  is  defined  as  in  Equation  (3-11),  and  its 
correlation  matrix  is 

(3-16)  n  =  E[ffl(n)a»]  =  iP  'D  = 

Canonical  correlations  constitute  an  effective  approach  for 
carrying  out  the  factorization  of  the  block  Hankel  matrix  in  the 

basis  of  the  innovations  representation. 

In  practical  applications,  data  is  available  for  finite  time 
and  the  formulation  presented  above  is  approximated  using  block 
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With  that  constraint,  the 


vectors  of  finite  dimension  (JL)  . 

canonical  variables  approach  presented  in  Appendix  C  can  be 
applied  directly.  Consider  the  JL-dimensional  vectors  and  of 

Equations  (2-24)  and  (2-25),  with  a  sufficiently  large  number  (L) 
of  block  elements.  With  reference  to  Appendix  C,  let  Xp  replace  i 
and  let  ip  replace  y  in  the  formulation.  More  specifically,  the 
JL  -dimensional  canonical  variables  U{n)  and  ii(n)  are  defined  by  the 
following  transformations  on  the  past  and  future  vectors, 

(3-17)  u(n)  =  Tpip 

(3-18)  fl(n)  =  Tpip 

Then,  it  is  desired  to  determine  the  canonical  variables  a(n)  and 
ii(n),  the  canonical  correlations  {pj,  and  the  JLxJL  transformation 
matrices  Tp  and  Tp  such  that  the  following  conditions  are  satisfied 

(Appendix  C) : ' 

(3-19)  E[u(n)u»(n))  .TpE[!Sp!s”]T”  =  =  'jl 

(3-20)  E[fi(n)]2»J  =TpE[!Sp}s”]T?  =  =  U 

(3-21)  E[JJ(n)u>)l  =TfE[2Sp2s”]t”  =  TpJ4  j"  = 

■  P,  0  ...  0  ■ 

o  Op,  0 

(3-22a)  =  .  .  . 

0  0  ...  PjL. 

(3-22b)  1  >  pi  >  p2  >  .  .  .  >  pjL  ^  0 
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Determination  of  the  model  order,  N,  is  desired  also  as  part  of 
the  procedure.  The  canonical  correlations  approach  is  also  well- 
suited  to  the  determination  of  model  order  because  the  model  order 
is  equal  to  the  rank  of  the  block  Hankel  matrix  Model  order 

determination  is  discussed  further  in  Section  3.3. 

Following  the  development  in  Appendix  C,  the  matrix  square 
root  is  determined  for  the  past  and  future  correlation  matrices 
using  the  SVD.  This  gives 

<3-23)  =  UpSpU?  =  (UpS;,'2u”)(UpS;.'2u^')  = 

5lpL.L  =  'JfSfUf  =  (UpSj/^U^KUpS^'^U”)  = 

Now  transform  the  past  and  future  vectors,  Xp  and  into  two  JL- 
dimensional  random  vectors  defined  as 

(3-25)  S  =  ^ 

(3-26)  Y  =  ^F 

Given  these  definitions,  it  is  easy  to  show  that 
(3-27)  E[QQ^]  =  IjL 

(3-28)  E[y;^]  =  1jl 

(3-29)  ElYfi""]  =  R^e  = 

Notice  that  the  random  vectors  Q  and  Y  ^re  correlation-normalized, 
but  their  cross-correlation  matrix,  R^,  is  not  diagonal.  Thus,  S 
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and  7  are  not  the  desired  canonical  variables.  However,  the 

correlation  coefficient  of  any  element  0j  of  S  and  any  element  Y|  of 

7  is  bounded  between  unity  and  zero  because  these  variables  are 

correlation-normalized  (Equations  (3-27)  and  (3-28)).  Therefore, 
the  diagonalization  of  the  zross -cor relat ion  matrix  must  be 

carried  out  using  only  unitary  operations  in  order  to  maintain  the 
correlation-normalized  property. 

Diagonalization  of  matrix  R.^  of  Equation  (3-29)  is  carried 
out  using  the  SVD,  which  results  in 

(3-30)  R^  =  UpApVp 


Here  Ur  and  Vp  are  unitary  JLxJL  matrices,  and  Ap  is  a  JLxJL 

diagonal  matrix  with  non-negative  elements  along  the  diagonal. 
The  diagonal  elements  of  are  bounded  by  unity  and  zero,  and  are 

arranged  in  order  of  decreasing  magnitude,  with  the  largest  at  the 
(1, 1)  location: 


(3-31a) 


0  ...  0  0 
62  •••  0  0 

0  •  •  •  5jl.^  0 

0  •••  0  5jl_ 


(3-3lb)  1  >  5^  >  52  >  .  .  .  >  5jl  ^  0 


The  transformations  which  diagonalize  the  cross-correlation  matrix 
are  identified  by  inspection  of  Equation  (3-30) , 

(3-32)  UpR^Vp  =  Ap 
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Therefore,  the  desired  canonical  variables  are  defined  by  the 
following  transformations  on  the  vectors  Q  and  7  (and  on  vectors  ip 
and  from  Equations  (3-25)  and  (3-26)), 


(3-33)  U(n)  —  Vp Q  ss  Vr  L^p  =  VpUpSp  Up  ip 
(3-34)  JJ(n)  .  =  uSu,^p'®U?  If 


and  the  desired  canonical  correlations  are  obtained  from  Equations 
(3-31)  and  (3-32)  as 

(3-35a)  E[  J3i(n)  =  UpRyeVp  =  Ap 

(3-35b)  Pi  =  Si  i  =  1.2 . JL 

Since  matrices  Up  and  Vp  are  unitary,  the  norms  of  vectors  li(n)  and 
ii(n)  are  equal  to  the  norms  of  vectors  fi  and  7,  respectively,  and 

the  auto-correlation  matrix  of  each  of  the  vectors  U(n)  and  m  is 
an  identity  matrix,  as  required.  The  transformation  matrices  Tp 
and  Tp  of  Equations  (3-17)  and  (3-18)  are  obtained  from  Equations 
(3-33)  and  (3-34)  as 

(3-36)  Tp  =  =  V^UpS^'^^U” 

(3-37)  Tp  =  =  U^UpSp'^U? 

This  completes  the  generation  of  the  canonical  correlations  and 
associated  parameters . 


Consider  Equation  (3-21)  now  that  all  the  matrices  in  that 
equation  are  known.  It  is  thus  possible  to  solve  for  the  block 
Hankel  matrix  as 

(3-38)  -^LL  * 

This  expression  can  be  factored  as  follows: 

(3-3^)  v”  J 

from  which  the  forward  and  backward  innovations  observability 
matrices  are  determined  by  inspection.  However,  a  more 
representative  expression  for  the  forward  and  backward  innovations 
observability  matrices  is  obtained  by  recognizing  that  for  an  Nth- 
oraer  system  the  last  JL  - N  canonical  correlations  are  equal  to 
zero.  That  is,  the  JLxJL  canonical  correlation  matrix  is 

partitioned  as 

rr,  to]  1  r  rr,  (0)  ■ 

101  Rr2  J  ’  L  [0!  10) 

with  the  NxN  diagonal  submatrix  as, 

5i  0  ...  0 

0  52  0 

0  0  5|y^ 

(3-4ib)  1>5,  >52>...>5n>0 
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(3-41a>  Rp.,  = 


(3-40)  R^^'  = 


and  Vq  are  partitioned  also  into 
corresponding  to  the  partitions  of 


where  is  JLxN,  Uq2  is  JLx(JL-N),  is  JLxN,  and  Vp2  is  JLx(JL- 
N).  Given  these  partitions,  the  forward  and  bacxward  innovations 
observability  matrices  are  obtained  as 

(3-44)  q_  = 

(3-45)  d”  . 

This  expression  for  allows  determination  of  the  state  of  the 
innovations  representation  via  the  finite-data  approximation  to 
Equation  (3-11) .  That  is, 

(3-46)  a(n)  =  =  Rffl  v”  =  RRTnt(n) 

where  lii(n)  denotes  the  first  N  elements  of  U(n).  Similarly,  the 
innovations  state  correlation  matrix  is  determined  via  the  finite¬ 
dimensional  approximation  to  Equation  (3-16),  or  directly  from 
Equation  (3-46) , 


The  unitary  matrices  Up 

submatrices  of  dimensions 
matrix  Rg^.  Specifically, 

(3-42)  [  Up4  Up2  ] 


(3-43) 


vS  = 


'ri 

'R2  -1 


(3-47) 


n  =  ^  =  Rp, 


Thus,  the  NxN  innovations  state  correlation  matrix  is  diagonal, 
with  its  diagonal  elements  equal  to  the  non-zero  canonical 
correlations.  As  stated  in  Section  2.3,  the  state  of  the 
innovations  representation  has  the  smallest  correlation  matrix  (in 
the  sense  of  positive  definiteness)  of  all  the  admissible 
correlation-equivalent  representations  for  system  (2-2) . 


The  system  parameter  matrices  can  be  identified  using 
Equation  (3-39)  and  the  procedure  of  Appendix  B.3.  However,  an 
approach  based  on  Equation  (3-21)  and  the  procedure  in  Appendix 
B.2  requires  less  computations,  and  is  the  approach  preferred 

herein.  The  key  to  the  approach  is  to  recognize  that  Equation  (3- 
21),  with  as  in  Equations  (3-40)  and  (3-41),  can  be  factored 

into  the  following  two  factors: 


(3-48)  ToO^.  = 


^1/2 

'ri 


(3-<9) 


Given  this  factorization,  proceed  as  follows.  First,  operate  on 
the  bloc)c  Han]cel  matrix  ^ll  only  on  the  left  with  matrix  T2  to 
obtain 


(3-50)  ^2^,1  =  (^2^1)^  = 

where  Equation  (3-48)  has  been  applied.  Now  let  Zj-  denote  the  NxJ 
upper-left-hand  submatrix  in  Equation  (3-50)  , 

(3-51)  2r  =  Rpfr 


R 


1/2 

R1 


-UL-HN 


[r  FT  F^'Y] 
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Then  matrix  F  is  obtained  as 

(3-52)  r  =  R;JfZr 


u 

An  analogous  procedure  is  followed  to  obtain  H  ,  That  is,  operate 
on  the  block  Hankel  matrix  ^\,x  only  on  the  right  with  the 
Hermitian  of  matrix  T-j  to  obtain 


(3-53)  = 


h”f 


HHpL-l 


.  ^N.JL-N  ] 


where  Equation  (3-49)  has  been  applied.  Now  let  denote  the  JxN 
upper-left-hand  submatrix  in  Equation  (3-53) ;  that  is, 

(3-54)  Zh  = 

Finally,  matrix  H  is  obtained  as 
(3-55)  =  ZHR'fjf 


To  determine  the  system  matrix,  F,  it  is  necessary  to  define  first 
a  column-shifted  (row-shifted)  block  Hankel  matrix  as 


(3-56) 


Ag 

^3 


^3 

A4 


=  q.Fif 


L  Al+1  Al^2  ■  ■  ■  AgL  J 
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The  key  to  the  determination  of  matrix  F  is  the  factorization  of 
the  column-shifted  Hankel  matrix  indicated  in  Equation  (3-56) .  It 
follows  from  Equations  (3-48),  (3-49),  and  (3-56)  that  pre- 

multiplication  of  ^.L  by  T2  and  post-multiplication  of  by  the 
H^rm'tian  of  Ti  results  in  the  following: 


;3-57a)  T25i;.,j“-(T20L)F(®fT”|  = 


q1/2 

”ri 

^JL-KN  J 


R1 


'N.JL-N 


(3-57b)  T2^.LTr = 


"r1  ^  ^R1 
^JL-N,N 


'^N.JL-N 
^JL-MJL-N  J 


'JL-KN 


'-’N.JL-N 
^JL-NIJL-N  J 


In  this  equation  the  NxN  matrix  Zf  is  defined  implicitly  as 

(3-58)  Z,=  R;,«FR;f 


The  NxN  matrix  F  is  obtained  easily  as 
(3-59)  F^RjJfZpRp^ 


This  completes  the  factorization  of  the  output  correlation  matrix 
sequence,  {An}. 


Determination 
innovations  model 
correlation  matrix 
as 


of  the  remaining  matrix  parameters  for  the 
(2-29)  is  described  next.  The  zero-lag  output 
is  estimated  directly  from  the  output  sequence 


Nt-1 

Ao  =  £  2l(k)x^(k) 


T  k«0 
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(3-60) 


where  Nj  is  the  total  number  of  output  data  vectors  (length  of  the 

output  sequence)  used  in  the  algorithm.  Output  correlation  lag 
estimation  accuracy  depends  on  this  number;  thus,  Nj  should  be 

selected  to  be  sufficiently  large.  The  innovations  correlation 
matrix  is  obtained  as  in  Equation  (2-30a) , 

(?-61)  Q=Ao-H”nH 

and  the  one-step  prediction  filter  (Kalman)  gain  follows  from 
Equation  (2-32a)  as 

(3-62)  K  =  [r-FnH]  =  [r  -  FHH]  [Ao  -  H^nH]-’ 

This  completes  the  canonical  correlations  model  parameter 
identification  algorithm. 

The  canonical  correlations  approach  leads  to  several 
alternative  solutions  to  the  system  identification  problem  based 
on  the  selection  of  the  basis  for  the  factorization  of  HK,  and  each 

alternative  solution  has  distinct  properties  and  features.  Of 
interest  herein  is  the  solution  that  corresponds  to  the  backward 
innovations  representation,  because  it  provides  additional  insight 
into  the  canonical  correlations  formulation.  The  backward 
innovations  representation  solution  is  obtained  in  a  manner 
analogous  to  the  development  completed  above. 

Consider  the  orthogonal  projection  of  X-(n)  onto  JC(n)  ( recall 

that  the  preceding  development  is  based  on  the  orthogonal 
projection  of  X'^(n)  onto  X~(n))  ,  and  let  (B(n-1)  denote  the  space 
generated  by  the  orthogonal  projection  of  Jr(n)  onto  X^(n).  That 
is, 

(3-63)  (B(n-1)  =  X-(n)  |X^(n) 


46 


®(n-1)  is  the  backward  state  space  of  the  process  {x(n)}  because  it 

is  spanned  by  the  state  of  the  backward  innovations 
representation.  'B(n-1)  is  finite-dimensional,  with  dimension  equal 
to  N .  The  space  X'(n)  can  be  represented  as  the  direct  sum  of  two 
orthogonal  subspaces, 

(3-64)  ^(n)  =  ;r(n)|.r(n)  ®  =  'B(n-l)  ®  'H^n-1) 

where  (B(n-1)  ±  'H^(n-1),  and  ‘JV(n*1)  is  the  space  spanned  by  the 
backward  innovations  process,  denoted  herein  as  im) .  As  before, 
the  geometric  structure  of  the  space  X"{n)  defined  by  Equation  (3- 
64)  is  valid  for  all  n  because  the  process  is  stationary. 

The  space  is  spanned  by  the  elements  of  the  conditional 

expectation  of  the  past  given  the  future, 

(3-65)  %p  =  E[x^|x^  =  x^  )■’ x^  = 


which  is  also  the  minimum  variance  estimate  of  the  past  for  the 
case  of  a  zero-mean,  Gaussian-distributed  process.  This  leads  to 
the  algebraic  representation  of  the  geometric  expression  (3-64), 

(3-56)  2Lp  =  Xy  +  0)^.1  = 


where  C0,i.i  is  an  i.nf  inite-dimensional  block  vector  having  the 
backward  innovations  sequence  as  block  elements;  that  is, 


M(n-1) 

m(n-2) 

5!j{n-3) 


4  7 


(3-67) 


Now  define  an  N-dimensional  vector  as 


(3-68)  ^(n-1)  = 

The  elements  of  ${n-1)  span  the  space  (B(n-1),  and  $(n-1)  is  the  state 
of  the  backward  innovations  representation  at  time  n-1  .  Using  this 
definition,  Equation  (3-66)  is  re-written  as 

(3-69)  =  (D$(n-1)  + 

and  this  equation  is  an  analytic  representation  of  the  statement 
that  the  backward  system  observability  matrix  maps  the  backward 
state  space  onto  the  output  space.  Of  particular  interest  is  the 
first  block  row  of  Equation  (3-69) .  Specifically  (recall  that 
occupies  the  first  J  rows  of  the  backward  observability  matrix) , 

(3-70a)  x(n'1)  =  r^0(n-1)  +  ^{r\A) 

(3 -7  Ob)  x(n)  =  r^0(n)  +  a)(n) 

In  Equation  (3-70b)  ,  and  in  the  remainder  of  this  section,  the 
time  argument  n-1  is  replaced  by  n  for  notational  simplicity  (this 
is  permissible  because  the  system  is  time-invariant) .  Equation 
(3-70)  is  the  output  equation  for  the  backward  innovations 
representation . 

The  finite-time  approximation  and  the  canonical  correlations 
approach  to  the  parameter  identification  problem  apply  also  to  the 
backward  formulation,  and  lead  to  a  solution  analogous  to  the 
forward  case.  In  particular,  the  state  vector  is  obtained  from 
Equations  (3-34),  (3-44),  and  (3-68)  as 
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(3-71) 


j,(n)  =  (^s:iL.LJSF=  RmfiC") 


And  the  steady-state  correlation  matrix  of  the  backward 
innovations  representation,  follows  simply  as 

(3-72)  rib  =  ^^(n)$>)]  = 

Notice  that  the  forward  and  backward  innovations  representation 
state  correlation  matrices  are  equal  to  each  other  (Equations  (3- 
47)  and  ( 3-48) )  , 

(3-73)  rijj  =  =  n 

A  system  representation  in  which  the  forward  and  backward  state 
correlation  matrices  are  both  diagonal  and  equal  to  each  other  is 
said  to  be  in  balanced  coordinates  in  the  stochastic  sense  (Desai 
et  al.,  1985).  Balanced  coordinates  allow  effective  model  order 
selection  and/or  model  order  reduction  (Moore,  1981) . 

3 . 3  tjadfil _ Order  Determination 

Model  order  determination  is  a  necessary  decision  for  any 
identification  algorithm  in  applications  where  the  true  order  of 
the  system  generating  the  channel  output  data  is  unknown,  or  where 
the  true  process  generating  the  data  may  not  be  a  member  of  the 
model  class  adopted  to  represent  the  data.  In  the  second  case  the 
model  generated  by  the  algorithm  is  a  "representation  model, "  as 
opposed  to  a  "physical  model"  (a  model  based  on  accurate  analyses 
of  the  underlying  physical  processes) .  Determination  of  the  model 
order  is  always  a  difficult  problem,  and  the  solution  is  rarely 
clear-cut.  The  canonical  correlations  identification  algorithm 


adopted  herein  does  have  several  strong  features  that  lead  to 
robust  and  straightforward  criteria  for  model  order  estimation. 
Principally,  the  algorithm  identifies  the  model  parameters  of  the 
innovations  representation  for  the  multichannel  process  in 
stochastic  balanced  coordinates. 

Model  order  selection  in  the  algorithm  is  based,  in  one  form 
or  another,  on  the  canonical  correlations  {Pj}/  which  are  the 

diagonal  values  of  matrix  R3^.  Thus,  it  is  important  to  recall 
that  the  canonical  correlations  are  real-valued,  non-negative, 

bounded  by  unity  and  zero,  and  are  arranged  along  the  diagonal  of 
matrix  in  order  of  decreasing  magnitude.  Furthermore,  the 

steady-state  correlation  matrix  of  the  state  of  th  forward  (FI) 
and  of  the  backward  innovations  models  in  balanced 

coordinates  are  diagonal,  with  the  diagonal  elements  equal  to  the 
canonical  correlations.  In  a  balanced  representation  the  position 
of  a  state  in  the  state  vector  is  indicative  of  the  importance  of 
the  contribution  of  that  state  to  the  output  correlation  sequence 
(the  first  state  is  equal  in  importance  or  more  important  than  the 
second  state;  etc.),  and  the  magnitude  of  the  corresponding 
correlation  matrix  element  is  representative  of  the  relative 
contribution  of  that  state  (Moore,  1981)  .  Thus,  a  simple  model 
order  selection  approach  is  to  identify  the  negligible  canonical 
correlations,  and  select  the  model  order  equal  to  the  number  of 
non-negligible  canonical  correlations . 

In  most  situations  involving  a  finite  amount  of  data,  all  the 
canonical  correlations  are  different  from  zero.  This  is  due  to 
the  fact  that  the  singular  value  decomposition  of  the  Hankel 
matrix  is  imperfect  for  finite  data  cases  because  the  measurement 
noise  corrupts  the  estimation  of  the  output  correlation  matrices. 
In  such  cases,  model  order  can  be  estimated  by  identifying  jump 
discontinuities  in  the  magnitude  of  the  canonical  correlatio.is, 


50 


and/or  by  identifying  the  correlations  at  which  diminishing 
returns  occur  (when  the  criterion  value  changes  by  a  negligible 
amount  after  increasing  the  number  of  states  by  one) . 


In  the  absence  of  one  or  more  jump  discontinuities,  external 
information  may  be  required,  such  as  prior  knowledge  of  the  system 
being  modeled.  Alternatively,  a  reasonable  model  order  can  be 
selected,  and  various  analyses  can  be  carried  out  to  reduce  the 
order  of  the  model  taking  advantage  of  the  features  of  a  state 
space  realization  in  balanced  coordinates. 

Model  order  can  be  determined  also  by  inspecting  the 
normalized  running  sum  of  the  canonical  correlations.  The  ith 
canonical  correlation  normalized  running  sum  is  defined  as 

t  Pk 

(3-74)  NRSi.-iji -  i  =  1,2 . JL 

IPk 

k*1 


Notice  that  the  JLth  normalized  running  sum  is  equal  to  unity. 
Notice  also  that  the  parameter  NRSj  is  the  fraction  of  the  past- 
to-future  correlations  covered  by  retaining  the  ith  largest 
canonical  correlations. 

Other  criteria  can  be  applied  for  model  order  determination. 
Squaring  the  canonical  correlations  emphasizes  discontinuities, 
and  thus  provides  a  good  criterion.  The  normalized  running  sum  of 


(3-75) 


NRSSi  = 


i  =  1.2 . JL 


i 


is  Still  another  useful  criterion.  These  last  two  criteria  are 
heuristic,  since  there  is  no  significance  to  the  square  value  of  a 
correlation  coefficient,  nor  to  its  normalized  running  sum. 
However,  these  two  criteria  generally  perform  better  model  order 
determination  than  the  canonical  correlations  and  their  running 
sums  . 


The  mutual  information  between  the  past  and  future  vectors  is 
the  basis  for  the  definition  of  two  other  model  order 
determination  criteria.  Mutual  information  does  have  statistical 
significance,  and  generally  provides  effective  model  order 
determination.  Consider  first  a  set  of  variables  {Kj}  defined  as 

the  following  nonlinear  function  of  the  canonical  correlations: 
(3-76)  Ki  =  -ln(l-p2)  i  =  1.2 . JL 

This  set  of  variables,  referred  to  herein  as  loo  parameters,  are 

part  of  the  definition  of  mutual  information,  and  can  be  used  for 

model  order  determination  by  detection  of  jump  discontinuities  or 

other  such  behaviour  in  the  sequence.  Gelfand  and  Yaglom  (1959) 
have  defined  the  mutual  information  between  the  past  and  future  as 
the  following  parameter, 

JL 

(3-77)  t  =  I 

m-1 


5  2 


Given  this  definition,  the  no cma 1 i Led , mut ua ] _ information  parameter 

for  an  ith-order  model  (with  i<  N)  is  then  defined  as 


(3-78) 


Tli  = 


T1 


m-1 

JL 

Z 

m-t 


i  =  1.2 . JL 


The  value  of  this  parameter  represents  the  fraction  of  the  mutual 
information  in  the  past  about  the  future  that  is  retained  by  the 
state  in  an  ith-order  model  representation  of  the  output  process. 

Table  3-1  lists  the  model  order  determination  criteria 
presented  herein.  In  an  off-line  model  order  determination  mode, 
the  procedure  to  follow  with  each  of  the  criteria  is  to  examine 
the  sequence  of  criteria  parameter  values  for  discontinuities, 
diminishing  returns,  etc.,  and  to  select  the  model  order  for  which 
a  maximum  of  information  is  retained.  In  an  on-line  mode,  one 
procedure  to  follow  with  each  of  the  criteria  is  to  select  the 
model  order  which  corresponds  to  the  criterion  value  that  meets  or 
exceeds  a  pre-selected  threshold.  As  an  example  consider  the  NRS 
parameters.  For  this  criterion,  the  model  order  which  corresponds 
to  the  parameter  value  which  meets  or  exceeds  a  threshold  such  as 
0.95  is  selected.  Another  procedure  is  to  define  a  threshold 
which  is  applied  to  the  increase  in  value  that  occurs  between  two 
consecutive  values  of  the  criterion  parameter.  A  change  of  a  few 
percent  is  a  reasonable  threshold  value  in  many  cases. 


CRITERION  DESCRIPTION 

SYMBOL 

Canonical  correlations 

(Pi) 

Normalized  running  sum  of  canonical  correlations 

Squared  canonical  correlations 

{pf} 

Normalized  running  sum  of  squared  canonical  correlations 

{NRSSi} 

Log  parameters 

Normalized  mutual  information  parameters 

{Tl;} 

Table  3-1.  List  of  candidate  model  order  determination  criteria. 

An  important  issue  related  to  model  order  is  the  selection  of 
the  number  of  block  columns  (rows)  in  the  block  Hankel  matrix,  L. 
Based  on  the  rank  properties  of  the  block  Hankel  matrix,  the  value 
for  L  should  be  selected  to  satisfy 

(3-79)  JL  >  Ng 

where  Ng  is  the  expected  (or  true)  model  order.  If  such  a  value 
is  not  available,  the  best  guess  at  an  upper  bound  for  the  true 
model  order  should  be  used. 
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4 . 0  INNOVATIONS  SEQUENCE  GENERATION 

In  the  approach  pursued  in  this  program,  an  unknown  system  of 
the  form  (2-2)  is  modeled  as  an  innovations  representation  (2-29). 
Thus,  once  the  innovations  model  parameters  have  been  identified, 
an  optimal  Kalman  filter  can  be  configured  to  generate  the 
innovations  sequence,  {£(n)},  for  the  likelihood  ratio  calculations. 

The  approach  described  in  this  section  is  applied  to  the 
observation  data  under  each  of  the  two  hypotheses. 

Any  one  of  several  equivalent  Kalman  filter  formulations  can 
be  applied  to  generate  the  innovations  sequence.  However,  the 
one-step  predictor  formulation  offers  significant  advantages  in 
the  context  of  the  intended  application  (Anderson  and  Moore, 
1979) .  Specifically,  the  one-step  predictor  formulation  generates 
the  innovations  sequence  and  the  filter  state  update  with  a  simple 
structure  in  the  case  where  the  input  and  output  noises  are 
correlated  (S^[0]  in  Equation  (2-5a)  )  ,  and  thus  imposes  less  real¬ 
time  computational  requirements  than  other  formulations.  Also, 
the  model  identification  algorithm  generates  the  parameters  for 
the  innovations  model.  Thus,  the  one-step  predictor  formulation 
is  adopted  in  this  work.  Strictly  speaking,  the  terminology  "one- 
step  predictor"  should  be  used  hereafter,  but  use  of  the  term 
"Kalman  filter"  is  accepted  universally.  Both  terms  are  used 
herein . 

The  steady-state  one-step  predictor  formulation  for  the 
innovations  model  (2-29)  is  a  linear,  t ime- invar iant  system 
described  by  the  following  equations; 

(4-la)  a(n+11n)  =  Fa(n|n-1)  +  K£(n)  n  >  Hq 

(4-ib)  £(n)  =  x(n)  -  x(n|n-1)  =  x(n)  -  H^a(nln-I)  n  >  Oq 
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( 4-lc) 


fl(nolno*1 )  =  Q 


Here  li(n+1|n)  is  the  estimate  of  the  innovations  model  state  vector 
at  time  n+1  based  on  observation  data  up  to  time  n,  £(n|n-1)  is  the 
estimate  of  the  observation  vector  at  time  n  based  on  observation 
data  up  to  time  n-1 ,  and  £(n)  is  the  innovations  associated  with  the 
observation  X(n).  Matrix  K  is  the  steady-state  filter  gain  matrix. 
The  filter  initial  condition  is  set  equal  to  zero  because  the 
innovations  model  initial  condition  is  zero,  Equation  (2-29c) .  A 
block  diagram  of  the  Kalman  filter  is  presented  in  Figure  4-1, 
displaying  the  channel  output  vector  as  input,  and  the  innovations 
sequence  vector  as  output . 


Figure  4-1.  Kalman  filter  block  diagram,  emphasizing  the 
innovations  sequence  generation  filter  function. 


The  steady-state  filter  is  an  approximation  to  the  optimal 
time-varying  filter.  If  the  channel  output  process  is  in  steady- 
state,  this  approximation  provides  acceptable  performance. 
Additionally,  the  steady-state  filter  provides  a  significant 
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reduction  in  the  real-time  computational  requirements  over  the 
time-varying  filter.  In  the  cases  where  the  channel  output 
process  is  not  in  steady-state,  filter  performance  is  suboptimal, 
and  the  degree  of  loss  of  optimality  needs  to  be  ascertained. 
Such  a  determination  is  a  topic  for  future  research.  A  related 
issue  involves  filter  initialization  transient  effects.  Since  the 
steady-state  filter  gain  is  used,  it  may  be  necessary  to  neglect 
the  first  Nj  filter  outputs  for  each  data  batch.  Determination  of 
the  value  Nj  can  be  carried  out  via  analysis  and  simulation,  and  is 
also  a  topic  for  future  research. 

Anderson  and  Moore  (1979)  show  that  the  filter  estimation 
error  for  an  innovations  model  is  zero  at  all  times.  That  is, 

(4-2)  ffl(n+1|n)  =  a{n+1) 

Correspondingly,  the  filter  estimation  error  correlation  matrix  is 
zero  also.  This  can  be  inferred  from  the  parallelism  between  the 
innovations  model  (2-29)  and  the  filter  representation  (4-1) . 
Thus,  knowledge  of  the  filter  implies  knowledge  of  the  innovations 
model,  and  viceversa. 
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5 . 0  LIKELIHOOD  RATIO  DETECTION 


A  detection  methoaology  for  complex-valued  multichannel 
Gaussian  processes  has  been  developed  by  Michels  (1991)  in  the 
context  of  innovat ions-based  detection.  This  approach  has  been 
generalized  recently  to  include  a  class  of  non-Gaussian  processes 
known  as  spherically-invariant  random  processes  (SIRPs)  and  using 
linear  estimators  (Rangaswamy,  Weiner,  and  Michels,  1993). 
Michels'  methodology  can  be  applied  directly  to  the  innovations 
sequence  generated  by  the  approach  formulated  herein.  For 
brevity,  only  the  likelihood  ratio  equation  is  presented  here. 

As  discussed  in  Section  4.0,  a  Kalman  filter  (one-step 
predictor)  is  determined  for  each  of  the  two  hypotheses  based  on 
processing  the  multichannel  data.  The  model  order  for  the 
alternative  hypothesis  (H-i)  filter  is  chosen  to  be  larger  than  the 
model  order  for  the  null  hypothesis  (Hq)  filter.  For  each 
hypothesis  filter,  denote  the  innovations  sequence,  Equation  (4- 
Ib) ,  as 

(5-1)  £(niHi)  =  x(n)  -  x(n|n-1:Hi)  =  x(n)- H^a(n|n-1:Hj)  i  =  0,  1 

The  steady-state  correlation  matrix  of  the  innovations  is  denoted 
as  a  (Hi) ,  and  is  defined  in  Equation  (3-60) . 

Let  0(Ho,H  i)  denote  the  .multichannel  likelihood  ratio  as 
defined  by  Michels  (1991)  for  the  Gaussian  signal  case.  Then,  the 
log-likelihood  ratio  (LLR)  can  be  expressed  as  follows, 

(5-2)  in[0(Ho,H,)]  =  £  I  +  £“(nlHo)Q‘'(Ho)£(n|Ho) 

-  £‘<(n|H,)£2'(H,)£(n|H,)] 
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The  LLR  is  compared  to  a  threshold,  T,  which  is  calculated 
adaptively  to  maintain  a  constant  false  alarm  rate  (CFAR) , 


r  ,  (  >  T  select  H, 

(5-3)  ln[©(Ho.Hi)]  = 

\  <  T  select  Hq 

A  candidate  CFAR  approach  with  demonstrated  good  performance 
calculates  the  median  of  a  set  of  the  LLR  values  from  a  number  of 
adjacent  range  cells  (at  the  same  azimuth)  on  both  sides  of  the 
cell  in  question,  and  scales  the  calculated  median  value  by  a 
pre-determined  constant  to  provide  the  desired  false  alarm  rate 
(Metford  and  Hay)cin,  1985)  . 

The  LLR  expression  has  to  be  modified  if  optimal  time-varying 
filters  are  used  instead  of  the  steady-state  filters.  In  such 
cases  the  modification  is  straightforward,  and  involves  replacing 
the  steady-state  correlation  matrices  of  the  two  innovations  by 
their  time-varying  values. 

Alternative  expressions  for  the  log-li)celihood  ratio  can  be 
generated  based  on  factorization  of  the  innovations  correlation 
matrix  and  spatial  whitening  of  the  innovations  process.  This 
includes  Cholesky  factorization,  LDU  decomposition,  and  SVD .  The 
first  two  techniques  have  been  described  by  Michels  (1991),  and 
lead  to  simplified  LLR  expressions.  The  SVD  technique  is  derived 
here  . 


Consider  the  steady-state  innovations  correlation  matrices 
for  each  of  the  two  hypotheses  and  carry  out  an  SVD  on  each 
correlation  matrix.  This  results  in  the  following  decompositions: 

(5-4)  Q(Hi)  =  i  =  0.  1 
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where  matrix  Vj  is  a  JxJ  unitary  matrix,  and  Z,  is  a  diagonal  matrix 
with  real-valued,  positive  elements  arranged  along  the  diagonal  in 


decreasing  order 

of  magnitude 

(it  is 

assumed 

herein 

that  the 

correlat 

ion  matrix  of  the  innovations 

sequence 

has  full  rank)  . 

That  is, 

1 

o 

o 

(5-5a) 

2:i  = 

0 

4  0 

i  =0,  1 

0 

0  ...  ' 

(5-5b) 

4  = 

>  .  . .  >  0^  >  0 

i=0.  1 

Since  matrix  Vj  is 

unitary,  the 

determinant  and 

inverse 

functions 

of  Q(Hi) 

are  obtained  easily  as 

(5-6) 

a-’(Hi)  = 

< 

< 

—  X 

i  =  0,  1 

(5-7) 

|Q(Hi)|  = 

k«1 

i  =  0.  1 

Now  make  a  linear  transformation  on  the  innovations  sequence  using 
the  unitary  matrix  Vj,  to  obtain 

(5-8)  y(n|Hi)  =  vf£(nlHi)  i  =  0, 1 

The  transformed  innovations  sequence,  {y{n|Hj)},  is  uncorrelated 
spatially  and  temporally  (recall  that  {£(n|Hj)}  is  uncorrelated 
temporally),  with  correlation  matrix  Zj.  Transformation  of  a  J- 
dimensional  vector  by  a  unitary  matrix  rotates  the  vector  in  the  J- 
dimensional  space,  but  does  not  alter  its  magnitude.  Thus,  the 


60 


spatial  whitening  transformation  does  not  alter  the  variance  of 
the  elements  of  the  innovations  vector. 


Substituting  Equations  (5-4)  through  (5-8)  into  Equation  (5- 
2)  results  in  the  following  LLR  expression 


(5-9) 


,  Nfr  J 

)]  =  S  I 

r 

In 

’  <  ' 

,  |v»(n|Ho)|^  |vk(n|H,)r 

n-Ho  k-1 

- 

-  <  - 

^Ok  <  J 

where  V|^(n|Hj)  denotes  the  kth  element  of  y(n|Hj).  This  LLR  is  of  the 

same  form  as  the  LLR  derived  by  Michels  (1991)  for  spatial 
whitening  of  the  innovations  using  an  LDU  decomposition. 
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6 . 0  SOFTWARE  SIMULATION 


The  identification  and  filtering  algorithms  described  in  the 
preceding  sections  have  been  programmed  in  FORTRAN  77  for  Apple 
Macintosh  processors.  Support  software  for  the  validation  and 
execution  of  the  routines  has  been  generated  also.  The  support 
software  includes  signal  generation  routines,  auxiliary  routines 
for  validation,  and  code  for  miscellaneous  calculations.  The 
identification  algorithm  makes  use  of  the  SVD .  An  SVD  subroutine 
for  complex-valued  matrices  was  obtained  from  a  version  of  the 
LINPACK  software  package  (Dongarra  et  al . ,  1979).  Separate  code 
was  written  and  exercised  to  validate  the  LINPACK  routines  before 
incorporation  into  the  main  algorithm  code.  The  signal  generation 
code  uses  a  Gaussian  random  number  generator  obtained  from  the 
text  by  Press  et  al.  (1989).  Sample  realizations  generated  by 
this  code  were  tested  for  whiteness  and  gaussianity. 

6 . 1  S,ol,twai:.g _ Yalidatlon 


Code  validation  was  carried  out  in  two  steps.  First,  all 
subroutines  and  select  segments  of  code  were  validated 
individually.  Second,  the  complete  package  was  validated  using 
examples  generated  for  that  purpose.  The  examples  consisted  of 
system  models  with  a  simple  structure  so  that  the  computer  output 
could  be  predicted.  Both  real-valued  and  complex-valued  examples 
were  generated.  One  particular  example  used  is  the  second-order 
system  defined  by  the  following  matrix  parameters  (for  a  system 
model  of  the  form  (2-2)) : 


F  = 


1 

; 

21 


=  G  =  =  Q  =  C  =  I2 
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This  model  was  used  to  generate  a  random  vector  sequence  to 
validate  various  aspects  of  the  software.  For  example,  defining 
matrix  F  with  f^i  =  f-(2  ”  ^22  “  ®  ^21  =  ^  generates  an  output  vector 

sequence  that  consists  of  white  noise  in  each  output  channel,  but 
the  two  channels  are  correlated  from  one  instant  to  the  next  (the 
correlation  is  due  to  the  coupling  induced  by  the  non-zero  (2,1) 
element  of  F)  .  The  output  of  the  identification  program  should 

indicate  a  first-order  model,  with  the  first  diagonal  element  of 
matrix  approximately  equal  to  0.7071,  and  low  values  for  the 

remaining  diagonal  elements.  This  was  the  result  obtained. 
Complex-valued  test  cases  using  this  sample  model  were  generated 
by  letting  F  be  a  diagonal  matrix  with  the  desired  complex-valued 
poles  along  the  diagonal. 

During  validation  and  testing  it  was  observed  that  system 
poles  along  the  real  axis  are  more  difficult  to  estimate  than 
poles  with  an  imaginary  component.  This  is  common  to  most 
identification  algorithms.  It  was  observed  also  that  poles  close 
to  the  unit  '  axis  (in  the  complex  Z  plane)  are  estimated  more 
accurately  than  poles  close  to  the  origin.  This  is  due  to  the 
fact  that  the  closer  that  a  pole  is  to  the  origin,  the  faster  the 
decay  of  its  response  to  an  excitation. 

6 . 2  Analyses  and  Simulation  Results 

The  software  has  been  exercised  also  with  cases  generated 
using  multichannel  AR  models  provided  by  the  program  monitor  at 
RL,  Dr.  James  H.  Michels.  These  cases  consist  of  signal  only, 
clutter  only,  signal  +  noise,  clutter  +  noise,  and  signal  + 
clutter  +  noise.  In  all  cases  the  signal,  clutter,  and  noise 
processes  are  statistically  independent  of  each  other. 
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Signal  ,.AR-MQ<l£,l 


The  signal  model  is  a  complex-valued,  two-input,  two-output 
AR  model  of  order  2  with  the  following  matrix  parameters, 

SU(n)  =  -  A?(1)iu(n-1)  ■  A?(2)iu(n-2)  +  Us(n) 


As(1) 


1.6290  -j  1.4241x10'^ 
1.3733x10'®  +j  3.8202x10'^^ 


1.3733x10'®  +j  3.8202x10'^® 
1.6290  -  j  1.4241x10'^ 


a”(2) 


0.80996  -  j  1 .41 62x10'^  1 .5259x1  O'®  -  j  9.0949x1 0'^^ 

1.5259x10'® -j  9.0949x10'^®  0.80996  -  j  1 .4162x10'^ 


The  input  to  the  signal  AR  recursion,  {Us('^)}/  is  a  zero-mean,  unit 
variance  white  noise  sequence  with  a  spatial  correlation  structure 
defined  as 


Qs 


0.13038  0.12907 

0.12907  0.13038  . 


This  two-input,  two-output  AR  model  corresponds  to  a  fourth-order 
state  space  model  in  an  innovations  representation  (as  described 
in  Appendix  A)  ,  with  poles  at  the  following  locations  in  the 
complex  Z-plane: 

True  Signal  Model  Poles:  -0.81451  ±  j  0.38282 

-0.81449  ±  j  0.38281 

This  AR  system  was  defined  by  Michels  to  have  a  very  high  channel- 
to-channel  correlation  (~0.99),  which  indicates  that  a  lower-order 
model  could  represent  the  signal  information.  Specifically,  a 
second-order  state  space  model  can  represent  the  signal 
information  well.  Notice  that  the  pole  locations  are  almost 
repeated  roots,  which  indicates  that  the  two  channels  are  almost 
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repeated  roots,  which  indicates  that  the  two  channels  are  al:iost 
identical.  Therefore,  given  a  high-level  of  channel-to-channel 
correlation,  a  reduced-order  model  should  perform  adequately. 

The  AR  process  {y:s(n)}  is  corrupted  by  a  zero-mean,  unit- 
variance  white  noise  sequence  to  give  the  noise-corrupted 
channel  output  sequence  as 

2<s(n)  =  y:s(n)  +  ^(n) 

For  this  noise  model  and  the  signal  model  given  above,  the  signal- 
to-noise  ratio  (SNR)  is  approximately  3  dB . 

Consider  the  problem  of  representing  the  AR  signal  in 
additive  white  noise  with  a  state  space  model  (see  Appendix  A)  . 
The  channel  output  noise,  Mn)},  alters  the  parameters  of  the  state 
space  model  designed  for  the  AR  signal  {y«(n)}  only,  but  {>U(n)}  can  be 
represented  as  the  output  of  a  state  space  model.  That  is,  {yg(n)} 

is  represented  as  the  output  of  an  innovations  model,  but  the 
model  for  (Xg(n)l,  which  includes  the  additive  noise  {w(n)},  is  not  an 
innovations  model  (there  is  an  innovations  model  for  {2<s(n)}^  it 
is  different  from  the  innovations  model  for  )  .  This  is  a 
manifestation  of  the  well-known  fact  that  an  AR  process  corrupted 
by  additive  output  white  noise  is  no  longer  an  AR  process.  In 
contrast,  the  state  space  model  remains  a  valid  representation  of 
the  signal  even  after  the  addition  of  a  new  noise  source.  The  AR 
model  class  is  a  subset  of  the  state  space  model  class;  thus,  the 
state  space  model  class  can  be  expected  to  provide  a  better  fit 
than  the  AR  model  class  for  a  wide  range  of  systems  and 
applications  where  independent  measurement  noise  is  present. 
Additionally,  state  space  identification  algorithms  can  be 
expected  to  deliver  comparable  performance  results  using  a  lower 


equivalent  model  order  than  algorithms  based  on  time  series 
models . 

Clutter  AR  Model 

The  clutter  model  is  a  complex-valued,  two-input,  two-output 
AR  model  of  order  2  with  the  following  matrix  parameters, 

)ic(n)  =  -  A?(1  )ye(n-1 )  -  A?(2))t(n-2)  +  u<,(n) 


Ac(1> 


-1.0430 

0.0 


0.0 

-1.0430 


A^(2) 


0.4900  0.0 

.  0.0  0.4900. 


The  input  to  the  clutter  AR  recursion,  {Uc(n)}  ,  is  a  zero-mean,  unit 
variance  white  noise  sequence  with  a  spatial  correlation  structure 
defined  as 


1.5502 

.  0.0 


0.0 

1.5502. 


This  two-input,  two-output  AR  model  corresponds  to  a  fourth-order 
state  space  model  in  an  innovations  representation  {see  Appendix 
A),  with  poles  at  the  following  locations  in  the  complex  2-plane; 

True  Clutter  Model  Poles:  0.5215  ±  j  0.4669 

0.5215  ±  j  0.4669 

The  clutter  AR  coefficient  values,  the  noise  covariance  values, 
and  the  diagonal  structure  of  this  AR  system  indicate  that  the  two 
channels  are  uncorrelated.  Thus,  a  fourth-order  state  space  model 
can  represent  the  clutter  information  well.  Notice  that  the  pole 
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locations  are  repeated  roots,  which  indicates  that  the  two 
channels  are  identical. 

The  clutter  AR  process  {yc('^)}  is  corrupted  by  the  zero-mean, 
unit-variance  white  noise  sequence  to  give  the  noise- 

corrupted  channel  output  sequence  as 

Xc(n)  =  yc(n)  +  w(n) 

For  this  noise  model  and  clutter  model  the  clutter-to-noise  ratio 
(CNR)  is  approximately  6  dB . 

Selected  Simulation  Results 

The  identification  and  filtering  software  was  exercised  with 
the  signal  plus  noise  sequence,  {Xs(^)}-  Calculated  values  of  the 

various  model,  order  criteria  indicate  that  a  second-order  state 
space  model  is  a  good  approximation  to  this  system,  as  expected. 
Plots  for  two  different  criteria  are  presented  in  Figures  6-1  and 
6-2  (all  plots  herein  are  for  single-realization  cases) . 
Specifically,  Figure  6-1  shows  the  canonical  correlations,  and 
Figure  6-2  shows  the  log  parameters  of  Equation  (3-76) .  In  both 
figures  the  abscissa  represents  model  order.  Notice  that  the  log 
parameters  provide  an  easier  determination  of  model  order  than  the 
canonical  correlations.  This  has  been  observed  to  be  the  case  in 
most  e.xamples  considered  thus  far.  The  same  assessment  is  true 
also  for  the  other  two  criteria  that  are  related  to  these  two 
criteria  (normalized  running  sum  of  canonical  correlations  and 
normalized  mutual  information,  respectively) .  The  plot  of  the 
squared  canonical  correlations  criterion  is  very  similar  to  the 
plot  of  the  log  parameters.  Figure  6-2.  This  also  has  been 
observed  in  most  examples  considered  thus  far. 
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Canonical  correlations 


Signal  Plus  Noise  Case  (SNR  =  3  dB) 


Figure  6-1.  Canonical  correlations  for  the  signal  plus  noise  case 

(SNR  =  3  dB  conditions) . 


Signal  Plus  Noise  Case  (SNR  =  3  dB) 


»-cvjco'^io«or>'00050'i-CNjco^in<or>.ooo>o»-cMco^ 

i 


Figure  6-2.  Log  parameters  criterion  for  the  signal  plus  noise 

case  (SNR  =  3  dB) . 
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Based  on  the  above  discussions,  model  order  2  was  selected 
for  the  analyses  and  simulations  involving  the  AR  signal  in  white 
noise.  Figure  6-3  is  a  plot  of  the  real  and  imaginary  parts  of 
the  first  element  of  a  single  realization  of  the  innovations 
vector  process,  (ei(n)},  generated  using  a  filter  of  order  2.  The 

filter  parameters  were  identified  using  a  total  of  25  output 
correlation  matrix  lags,  including  the  zero-lag  correlation 
matrix.  This  corresponds  to  L=12  in  the  block  Hankel  matrix.  The 

output  correlation  matrix  lags  were  estimated  using  a  single 
realization  of  the  output  process  with  a  duration  of  Ny=2,500 

output  sequence  vectors.  Only  the  first  500  points  are  shown  in 
Figure  6-3  (representing  one-fifth  of  the  available  results),  but 
these  points  are  representative  of  the  total  innovations  process. 
The  innovations  sequence  appears  to  be  unbiased,  with  a  calculated 
sample  mean  of 

.  0.0327  -  j  0.0081 
^  ^  [0.0063  +j  0.0157 

Notice  also  the  high  degree  of  "whiteness"  exhibited  by  the 
innovations.'  The  second  element  of  the  innovations  vector 
sequence,  {£2(^)1'  behaves  similarly. 

The  zero-lag  innovations  correlation  matri.x  identified  by  the 
software  using  Equation  (3-61)  is 

1.5328  0.5089  +j  0.00501 
0.5089  -  j  0.0050  1.5326  J 

and  agrees  very  well  with  the  sample  correlation  values.  Several 
simulation  runs  were  made  using  multiple  sample  realizations  of 
the  same  length  and  filter  order  two.  In  all  cases  the  results 
indicate  clearly  a  white  innovations  process. 
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imaginary  component  of  innovations  real  component  of  innovations 


first  element  of  innovations  vector  for  case  of  signal  plus  noise 
(order=2,  SNR=3clB) 


,4 

>  ,  '4< 


or. 


time  index,  n 


first  element  of  innovations  vector  for  case  of  signal  plus  noise 
(order  2.  SNR=3dB) 


time  index,  n 


Figure  6-3 .  Real  and  imaginary  parts  of  the  first  element  of 
innovations  sequence  vector  for  the  case  of  signal  plus  noise 

(SNR  =  3  dB  conditions) . 
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Identification  algorithm  performance  can  be  assessed  by 
examining  the  roots  of  the  identified  innovations  model  system 
matrix,  F.  The  scatter  plots  in  Figure  6-4,  which  correspond  to 
results  obtained  for  ten  distinct  realizations,  illustrate  the 
parameter  identification  capability  of  the  algorithm.  These 
scatter  plots  show  the  ten  identified  root  pairs,  all  in  close 
proximity  to  the  true  roots  (recall  that  the  true  roots  are 
located  at  -0.8145  ±  j  0.3828).  All  the  identified  roots  are  at  a 
distance  less  than  1.5%  of  the  true  values,  and  most  are  much 
closer  than  that. 


Root  No.  1 


-0.35- 


t: 

to  « 

CL  -0.37- 

CO 

c 

O) 

g  -0.39' 


-0.41 

-0.84  -0.82  -0.8  -0.78 

real  part 


X 

V 

X  ^ 

■ 

X 

Root  No.  2 


real  part 


Figure  6-4 .  Scatter  plot  of  real  and  imaginary  parts  of 
identified  model  poles  for  ten  distinct  realizations  of  signal 
plus  noise  (SNR  =  3  dB  conditions) . 


The  software  was  used  also  to  model  and  analyze  the  clutter 
plus  noise  sequence,  {2<c('^)}  •  this  case  at  a  CNR  of  20  dB,  the 

concensus  of  the  model  order  criteria  indicate  a  fourth-order 
state-space  model,  as  expected.  A  plot  of  the  normalized  running 
sum  of  the  canonical  correlations  (parameter  NRSj)  is  presented  in 
Figure  6-5,  and  a  plot  of  the  normalized  mutual  information  is 
presented  in  Figure  6-6  (both  of  these  figures  present  single- 
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realization  cases) .  Notice  in  Figure  6-6  that  there  is  a 
significant  increase  in  mutual  information  as  the  model  order  is 
increased  up  to  fourth-order,  but  for  fifth-order  and  beyond  the 
increase  in  mutual  information  is  small  compared  to  the  prior 
increases .  Such  is  not  the  case  with  the  NRS  j  criterion,  as 
evident  in  Figure  6-5.  The  plot  for  the  normalized  running  sum  of 
the  canonical  correlations  squared  {parameter  NRSSj)  is  very 
similar  to  the  plot  of  the  normalized  mutual  information  (Figure 
6-6) .  As  in  the  case  of  signal  plus  noise,  criteria  which  involve 
the  canonical  correlations  in  a  linear  manner  are  not  as  useful  as 
criteria  based  on  nonlinear  functions  of  the  canonical 
correlations.  These  results  together  with  the  knowledge  of  the 
lack  of  channel  correlation  indicate  that  a  fourth-order  model  is 
a  good  approximation  to  this  system. 


Figure  6-5.  Normalized  running  sum  of  canonical  correlations 
criterion  for  the  clutter  plus  noise  case  (CNR  =  20  dB) . 
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Clutter  Plus  Noise  Case  (CNR  =  20  dB) 


Figure  6-6.  Normalized  mutual  information  criterion  for  the 
clutter  plus  noise  case  (CNR  =  20  dB) . 


Based  on  the  above  discussion,  model  order  four  was  selected 
for  the  state-  space  representation  of  the  clutter  AR  process  in 
additive  white  noise.  Plots  of  the  real  and  imaginary  parts  of 
the  first  element  of  the  innovations  vector  process,  {Ei{n)},  are 

presented  in ■ Figure  6.7.  These  results  were  generated  using  a 
fourth-order  filter  and  6  dB  CNR  conditions.  The  other  simulation 
parameters  are  the  same  as  in  the  signal  plus  noise  case. 
Specifically,  a  total  of  25  output  correlation  matri.x  lags, 
including  the  zero-lag  correlation  matrix,  were  used  to  identify 
the  filter  parameters  .  This  corresponds  to  L  =  12  i.n  the  block 
Hankel  matrix.  Also,  the  output  correlation  matrix  lags  were 
estimated  using  a  single  realization  of  the  output  process  with  a 
duration  of  Nj=2,500  output  sequence  vectors.  Only  the  first  500 

points  are  shown  in  Figure  6-7  (representing  one-fifth  of  the 
available  innovations  sequence  in  this  run) ,  but  these  points  are 
representative  of  the  total  innovations  sequence.  Both  components 
(real  and  imaginary)  of  the  sequence  {Ei(n)}  are  unbiased,  as 

indicated  by  the  sample  mean, 

73 


£(n) 


0.0586  +  j  0.0321  ; 
-  0.1025  -j  0.0271  i 


An  estimate  of  the  real  and  imaginary  parts  of  the  sample  auto¬ 
correlation  function  of  {ei(n)}  of  Figure  6-7  is  given  in  Figure  6-8. 

The  behaviour  of  the  real  part  is  representative  of  a  white 
innovations  sequence;  an  impulse  at  lag  n  =  0,  and  approximately 
equal  to  zero  everywhere  else.  The  imaginary  part  exhibits  low- 
amplitude  oscillations  about  zero,  also  as  expected  of  a  white 
innovations.  Several  distinct  output  sequence  realizations  were 
generated  and  processed  using  the  same  parameters,  and  the 
performance  was  similar  in  all  cases.  The  zero-lag  innovations 
correlation  matrix  estimated  using  Equation  (3-61)  is 


Q  = 


3.2694 

-  0.0888  +  i  0.0073 


-  0.0888  •  j  0.0073 
3.1369 


Element  (1,1)  of  Q  agrees  within  less  than  1%  with  the  sample 

correlation  value  of  3.290  +  j  0,0  indicated  in  Figure  6-8.  The 
behaviour  of  (£2(0)}  is  similar. 


Figure  6-9  presents  scatter  plots  of  the  roots  of  the  fourth- 
order  system  characteristic  equation  for  ten  realizations.  The 
roots  are  clustered  about  the  values  of  the  true  repeated  roots, 
0.5215  ±  j  0.4669,  which  are  close  to  the  center  of  the  plots 
shown.  The  largest  root  estimation  error  is  appro;<imately  12.7%. 
This  error  is  larger  than  the  worst  error  in  the  signal  plus  noise 
case,  and  is  due  to  the  greater  difficulty  in  estimating  faster 
modes . 
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imaginary  component  of  innovations  real  component  of  innnovations 


first  element  of  innovations  vector  for  case  of  clutter  plus  noise 
(order  4,  CNR=6dB) 


time  index,  n 


first  element  of  innovations  vect^^r  for  case  of  clutter  plus  noise 
(order  4,  CNR=6dB) 


time  index,  n 


gure  6-7.  Real  and  imaginary  parts  of  the  first  element  of  the 
Lnnovations  sequence  vector  for  the  case  of  clutter  plus  noise 

(CNR  =  6  dB  conditions)  . 


imaginary  component  of  covariance  component  of  covariance 


first  element  of  innovations  covariance  for  null  hypothesis  data 

„  ^  using  null  hypothesis  filter  (order  4.  CNR=6dB) 
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Figure  6-8.  Real  and  imaginary  parts  of  the  auto-correlation 
function  of  the  first  element  of  the  innovations  sequence  vector 
for  the  case  of  clutter  plus  noise  (CNR  =  6  dB) . 


using  null  hypothesis  filter  (order  4,  CNR=6dB) 
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Figure  6-9.  Scatter  plot  of  real  and  imaginary  parts  of 
identified  model  poles  for  ten  distinct  realizations  of  clutter 

plus  noise. 
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Similar  biased  behaviour  has  been  observed  in  other 
estimation  results  (Michels,  1992b),  as  well  as  in  detection 
performance  results  (Michels,  1992a)  obtained  using  time  series 
(AR)  models.  For  the  state-space  approach  pursued  herein, 
unbiased  estimates  with  reduced  variance  can  be  obtained  by 
averaging  several  individual  estimates  and/or  by  increasing  the 
duration  of  the  output  process  realization. 

Various  simulations  were  carried  out  to  obtain  a  first-order 
assessment  of  the  discrimination  capability  of  the  innovations- 
based  methodology  using  the  canonical  correlations  algorithm.  One 
set  of  simulations  involved  designing  a  Kalman  filter  for  each 
hypothesis,  processing  data  corresponding  to  each  of  the  two 
hypotheses  using  both  filters,  and  analyzing  the  resulting  four 
filter  output  sequences  (two  filters,  and  each  filter  processes 
data  sets  corresponding  to  each  of  the  two  hypotheses)  .  These 
results  are  presented  next.  As  before,  all  plots  correspond  to 
single-realization  cases. 

Consider  first  the  case  of  processing  data  from  each  of  the 
two  hypotheses  using  a  null  hypothesis  filter,  corresponding  to 
clutter  +  noise  only.  For  this  case  the  filter  order  is  four,  as 
mentioned  earlier  in  the  clutter  plus  noise  model  discussion. 
Results  are  presented  herein  for  two  sets  of  conditions:  (a)  SNR  = 
3  dB  and  CNR  =  6  dB;  and  (b)  SNR  =  3  dB  and  CNR  =  20  dB .  For  each 
set  of  conditions  the  procedure  described  next  v;as  followed. 

•  A  realization  of  the  clutter  noise  process  of  duration 
Nj =  2,500  was  generated  and  25  output  correlation  matrix 

lags  were  estimated.  These  correlation  lags  were 
processed  to  design  a  fourth-order  Kalman  filter.  The 
resulting  filter  is  the  filter  for  the  .null  hypothesis 
(signal  not  present). 
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•  The  null  hypothesis  filter  was  applied  to  a  clutter  + 
noise  process  sequence  of  duration  N-j- =  2,500,  and  the 

sample  correlation  matrix  sequence  of  the  filter  output 
sequence  was  calculated.  The  real  and  imaginary  parts 
of  the  (1,1)  element  of  the  resulting  sample  correlation 
matrix  sequence  are  plotted  in  Figure  6-8  for  CNR  =  6  dB 
conditions,  and  Figure  6-10  for  CNR  =  20  dB  conditions. 

Both  sets  of  figures  are  representative  of  the  auto¬ 
correlation  of  a  white  innovations  sequence,  as  expected 
(both  sets  of  figures  show  low-level  energy  content  at 
the  higher  lags) . 

•  The  null  hypothesis  filter  was  applied  to  a  combined 
signal  +  clutter  +  noise  process  sequence  (alternative 
hypothesis  case)  of  duration  Nj  =  2,500,  and  the  sample 

correlation  matrix  sequence  of  the  filter  output 
sequence  was  calculated.  In  this  case,  however,  the 
sequence  is  not  a  true  innovations  sequence  because  the 
filter  -is  not  optimal  for  this  process.  The  real  and 
imaginary  parts  of  the  (1,1)  element  of  the  resulting 
sample  correlation  .matrix  sequence  are  plotted  in  Figure 
6-11  for  CNR  =  6  dB  conditions,  and  Figure  6-12  for  CNR 
=  20  dB  conditions.  Both  of  these  figures  show  a  marlced 
deviation  from  the  expected  auto-correlation  for  a  white 
innovations  sequence. 

In  the  discussions  and  results  presented  above  the  (2,2)  element 
of  the  sample  correlation  matrix  is  not  referred  to.  This  is  due 
to  the  fact  that  its  behaviour  is  very  similar  to  the  behaviour  of 
the  (1,1)  element. 
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In  continuation  of  the  first-order  assessment  of  the 
discrimination  capability  of  the  canonical  correlations  approach, 
consider  now  the  case  of  processing  data  from  each  of  the  two 
hypotheses  using  an  alternative  hypothesis  filter,  corresponding 
to  the  combined  process  of  signal  +  clutter  +  noise.  Since  the 
signal  and  clutter  are  uncorrelated  in  this  set  of  examples,  a 
sixth-order  state  space  model  is  required  for  the  combined 
process.  As  before,  results  are  presented  for  two  sets  of 
conditions:  (a)  SNR  =  3  dB  and  CNR  =  6  dB;  and  (b)  SNR  =  3  dB  and 
CNR  =  20  dB .  For  each  set  of  conditions  the  procedure  described 
next  was  followed  (all  plots  are  for  single-realization  cases) . 

•  A  realization  of  the  combined  signal  +  clutter  +  noise 
vector  process  of  duration  Nj  =  2,  500  was  generated,  and 

25  lags  of  the  output  correlation  matrix  sequence  were 
estimated.  These  lags  were  processed  to  design  a  sixth- 
order  Kalman  filter.  The  resulting  filter  is  the  filter 
for  the  alternative  hypothesis  (signal  present). 

•  The  alternative  hypothesis  filter  was  applied  to  a 
combined  process  sequence  of  duration  Ni-=  2,500,  and  the 

sample  correlation  matrix  sequence  of  the  filter  output 
sequence  was  calculated.  The  real  part  of  the  (1,1) 
element  of  the  resulting  sample  correlation  matrix 
sequence  is  plotted  in  Figure  6-13  for  CNR  =  6  dB 
conditions,  and  Figure  6-14  for  CNR  =  20  dB  conditions. 

Both  figures  present  correlation  sequences  which 
correspond  to  white  innovations  sequences,  as  expected 
(both  figures  show  low-level  energy  content  at  the 
higher  lags) . 

•  The  alternative  hypothesis  filter  was  applied  to  a 
clutter  +  noise  process  sequence  (null  hypothesis  case) 
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of  duration  Ny =  2,500,  and  the  sample  correlation  matrix 
sequence  of  the  filter  output  was  calculated.  In  this 
case,  however,  the  sequence  is  not  a  true  innovations 
sequence  because  the  filter  is  not  optimal  for  this 
process.  The  real  part  of  the  (1,1)  element  of  the 
resulting  sample  correlation  matrix  sequence  is  plotted 
in  Figure  6-15  for  CNR  =  6  dB  conditions,  and  Figure  6- 
16  for  CNR  =  20  dB  conditions.  The  correlation  sequence 
in  each  of  the  figures  corresponds  to  a  colored  process, 
and  not  to  a  white  innovations  sequence.  Such  is  the 
expected  result . 

Figures  6-13  through  6-16  do  not  include  the  imaginary  part  of  the 
sample  correlation  sequence  because  it  is  similar  to  the  imaginary 
part  of  the  sample  correlation  sequence  presented  in  the  preceding 
figures.  Also,  in  all  cases  the  behaviour  of  the  (2,2)  element  is 
very  similar  to  that  of  the  (1,1)  element  of  the  sample 
correlation  matrix  sequence,  as  before. 

These  r-esults  indicate  that  the  innovat ions-based  detection 
methodology  using  the  canonical  correlations  identification 
algorithm  can  discriminate  between  data  corresponding  to  each  of 
the  two  hypotheses.  That  is,  a  filter  designed  for  the 
alternative  hypothesis  (signal  +  clutter  +  noise)  generates  a  true 
innovations  sequence  given  a  signal  +  clutter  +  noise  channel 
process,  and  generates  a  colored  output  given  a  clutter  *  noise 
channel  process.  Analogously,  a  filter  designed  for  the  null 
hypothesis  (clutter  +  noise)  generates  a  true  innovations  sequence 
given  a  clutter  +  noise  channel  process,  and  generates  a  colored 
output  given  a  signal  +  clutter  +  noise  channel  process  . 
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first  element  of  innovations  covariance  for  null  hypothesis  using 


time  index,  n 


first  element  of  innovations  covariance  for  null  hypothesis  data 


time  index,  n 


Figure  6-10.  Real  and  imaginary  parts  of  the  auto-correlation 
function  of  the  (1,1)  element  of  the  innovations  sequence  vector 
for  the  case  of  null  hypothesis  data  using  the  null  hypothesis 
filter  (CNR  =  20  dB  conditions) . 
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first  element  of  innovations  covariance  for  alternative  hypothesis 
data  using  null  hypotnesis  filter  (order  4,  CNR=6dB) 
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first  element  of  innovations  covariance  for  alternative  hypothesis 
Q  ^  data  using  null  hypothesis  filter  (order  4,  CNR=6dB) 
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Figure  6-11.  Real  and  imaginary  parts  of  the  auto-correlation 
function  of  the  (1,1)  element  of  the  filter  output  vector  for  the 
case  of  alternative  hypothesis  data  using  the  null  hypothesis 
filter  (CNR  =  6  dB  conditions) . 
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first  element  of  innovations  covariance  for  alternative  hypothesis 


time  index,  n 


first  element  of  innovations  covariance  for  alternative  hypothesis 


time  index,  n 


Figure  6-12.  Real  and  imaginary  parts  of  the  auto-correlation 
function  of  the  (1,1)  element  of  the  filter  output  vector  for  the 
case  of  alternative  hypothesis  data  using  the  null  hypothesis 
filter  (CNR  =  20  dB  conditions) . 
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first  element  of  innovations  covariance  for  alternative  hypothesis 


time  index,  n 


Figure  6-13.  Real  part  of  the  correlation  function  of  the  (1,1) 
element  of  the  innovations  sequence  vector  for  alternative 
hypothesis  data  using  alternative  hypothesis  filter  (CNR  =  6  dB) 

first  element  of  innovations  covariance  for  alternative  hypothesis 


time  index,  n 


Figure  6-14,  Real  part  of  the  correlation  function  of  the  (1,1) 
element  of  the  innovations  sequence  vector  for  alternative 
hypothesis  data  using  alternative  hypothesis  filter  (CNR  =  20  dB) 
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first  element  of  innovations  covariance  for  null  hypothesis  data 
using  alternative  hypothesis  filter  (order  6,  CNR=6dB) _ 


Figure  6-15.  Real  part  of  the  correlation  function  of  the  (1,1) 
element  of  the  filter  output  vector  for  null  hypothesis  data  using 
the  alternative  hypothesis  filter  (CNR  =  6  dB  conditions) . 
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using  alternative  hypothesis  filter  (order  6,  CNR=20dB) 
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Figure  6-16.  Real  part  of  the  correlation  function  of  the  (1,1) 
element  of  the  filter  output  vector  for  null  hypothesis  data  using 
the  alternative  hypothesis  filter  (CNR  =  20  dB  conditions) . 
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7 . 0  CONCLUSIONS  AND  RSCOMMENOATIONS 


The  work  carried  out  in  this  program  emphasized  the 
development  and  analysis  of  a  state  space  methodology  and 
algorithm  for  the  model-based  multichannel  defection  problem  in 
the  context  of  radar  system  applications.  Application  of  state 
space  techniques  for  multichannel  detection  in  radar  systems  is 
one  novel  aspect  of  the  work  reported  here.  The  state  space  model 
class  is  richer  than  the  time  series  model  class  that  is  used 
often  in  radar  system  applications.  And,  as  demonstrated  in  this 
work,  the  state  space  model  class  can  be  used  to  represent 
effectively  multichannel  radar  signals. 

Another  novel  aspect  of  the  work  is  the  utilization  in  the 
detection  methodology  of  the  canonical  correlations  algorithm 
developed  by  Desai  et  al.  (1985),  which  in  turn  is  based  on  the 
work  of  Akaike  (1974;  1975).  This  algorithm  was  adopted  in  the 
program  for  the  multichannel  radar  output  modeling  and  parameter 
identification  functions.  In  the  process,  the  algorithm  was 
extended  to  the  case  of  complex-valued  radar  system  data,  and  an 
alternative  derivation  of  the  algorithm  was  developed  which  is 
based  on  the  SVD  technique.  The  SVD  is  a  robust  and  stable 
numerical  technique.  Thus,  the  algorithm  offers  numerical  and 
performance  advantages  over  other  techniques . 

A  computer  simulation  was  developed  to  validate  the  algorithm 
and  methodology,  and  to  serve  as  a  testbed  for  evaluation  of  the 
algorithm  in  radar  system  applications.  The  simulation  can  be 
exercised  with  internally-generated  sample  multichannel  output 
data,  or  with  externally-provided  data.  Extensive  tests  were 
carried  out  to  validate  the  code. 
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Simulation-based  analyses  carried  out  to  date  demonstrate  the 
feasibility  of  the  SSC  state  space  approach  for  multichannel 
identification  and  detection  in  radar  system  applications.  The 
algorithm  has  demonstrated  the  capability  to  discriminate  between 
signal  plus  clutter  plus  noise  and  clutter  plus  noise  in  an 
innovat ions-based  detection  algorithm  formulation  for  the 
multichannel  detection  problem.  Several  cases  have  been  analyzed 
at  various  SNR  and  CNR  levels,  and  in  all  cases  simulated  thus  far 
discrimination  has  been  demonstrated. 


In  the  process  of  completing  the  work  reported  here  several 
areas  have  been  identified  for  further  research  and  development  in 
future  programs.  These  areas  are  summarized  below. 

Processor  Requirements  Definition 

Determination  of  the  true  potential  of  the  SSC  approach  for 
radar  system  applications  requires  the  establishment  of  a  detailed 
set  of  requirements  for  various  radar  problems  such  as  space/time 
processing  in  a  radar  array  system  and  the  fusion  of  data  from 
multiple  distinct  radar  systems. 

AdcLiLlonai  Analyses  .and- Detailed  Algorithm  fgrmulatign 

The  analyses  listed  below  are  required  to  generate  a  detailed 
algorithm  definition  for  the  requirements,  and  to  provide  a 
precise  assessment  of  the  SSC  approach  in  the  context  of  radar 
system  applications  requirements. 

•  The  innovations  model  matrix  parameters  F,  T,  and  H  can 
be  estimated  using  different  equations.  These 
alternative  approaches  need  to  be  evaluated  and  traded 
with  respect  to  computational  efficiency  and  accuracy. 


•  Model  order  selection  criteria  for  on-line  and  off-line 
decisions  need  to  be  evaluated  and  traded  further. 

This  includes  the  ones  discussed  in  Section  3.3. 

•  The  steady-state  Kalman  filter  was  used  in  this  program 
to  generate  the  innovations  sequence.  Alternatively, 
the  time-varying  Kalman  filter  can  be  used.  The  loss 
in  performance,  if  any,  incurred  by  using  the  steady- 
state  approximation  needs  to  be  evaluated.  A  related 
issue  is  the  duration  of  the  transient  effect  in  the 
case  of  the  steady-state  filter. 

•  Key  implementation  parameters  for  radar  system 
applications  need  to  be  established.  This  includes  the 
minimum  required  channel  output  sequence  duration,  and 
the  block  dimension  of  the  block  Hankel  matrix. 

•  Identification  and  detection  performance  should  be 
compared  with  that  of  other  methods.  This  includes 
methods  based  on  time  series  models. 

Once  these  technical  issues  are  addressed,  a  detailed  architecture 
design  can  be  defined. 

Real-Time  Processor  Architecture  Design 

A  real-time  implementation  architecture  for  the  algorithm 
should  be  developed,  and  a  candidate  hardware  implementation 
identified.  Specifically,  the  following  issues  should  be 
addressed . 
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•  Generation  of  an  architecture  design  that  best  meets 
the  features  of  the  detailed  algorithm  design  and  the 
established  processor  requirements .  The  result  may  be 
an  architecture  with  features  different  from  those  in 
existing  processors,  and  which  is  likely  to  consist  of 
various  fundamental  architectures  (systolic;  vector; 
parallel  arrays;  etc.). 

•  Analysis  of  state-of-the-art  processors  to  determine 
which  contemporary  and  next -generation  VLSI  components 
best  match  the  optimized  architecture  design  and  the 
requirements . 

In  addressing  these  issues  the  emphasis  should  be  on  the  most 
computation-intensive  tasks  of  the  algorithm. 
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APPENDIX  A.  STATE  SPACE  REPRESENTATION  OF  TIME  SERIES 


MODELS 

Consider  a  discrete-time,  time-invariant,  complex-valued, 
zero-mean,  random  process  U(n)}  defined  as  the  output  of  the 
following  state  space  system  model 

(A-ia)  ^(n+1 )  =  F^(n)  +  Gu(n) 

(A- lb)  2i(n)  =  H^y(n)  +  D^^(n) 

Vector  recursive  processes  such  as  moving-average  (MA) ,  auto¬ 
regressive  (AR)  ,  and  auto-regressive  moving-average  (ARMA) 
processes  can  be  modeled  with  state  variable  models  (SVMs)  of  the 
form  (A-1) .  The  discussion  herein  is  limited  to  the  particular 
case  where  the  matrix  coefficients  of  the  recursion  are  square 
matrices,  and  the  number  of  output  coefficients  is  equal  to  the 
number  of  input  coefficients.  The  generation  of  a  minimal-order 
SVM  for  a  vector  recursive  process  involves  the  properties  of 
polynomial  matrix  pairs  and  canonical  forms  for  multiple  input, 
multiple  output  SVMs. 

In  contrast,  minimal-order  SVMs  for  scalar  recursive 
processes  (MA,  AR,  ARMA)  can  be  generated  in  a  straightforward 
manner  given  the  recursion  coefficients.  The  SVM  generic  form 
appropriate  for  modeling  scalar  recursive  processes  is 

{A-2a)  ^(n+1)  =  F^(n)  +  flu(n) 

(A-2b)  x(n)  =  ]i^y(n)  +  cj*w(n) 

This  SVM  is  a  single-input,  single-output  system. 


A.  1 


A  scalar  MA  process  of  order  M  is  defined  as 

x(n)  =  X  bkU(n-k) 

k»0 


x(n)  =  bou(n)  +  biu(n-1)  +  b2U(n-2)  +  . . .  +  bMU(n-M) 

where  {u(n)}  is  a  zero-mean  white  noise  sequence.  This  recursion 
can  be  modeled  with  a  state-space  system  of  the  form  {A-2)  with 
input  sequence  {u(n)},  and  state  vector  with  elements  that  are 
determined  by  the  input  sequence, 


'  yi(n)  ■ 

u(n-1) 

y(n)  = 

u{n-M+1) 

L  yM(n)  . 

u(n*M) 

V  n 


The  output  noise  sequence  is  also  equal  to  the  input  noise 
sequence, 


w(n)  =  u(n)  V  n 

which  means  that  the  input  and  output  noise  sequences  in  the  state 
space  model  are  completely  correlated.  Model  parameters  (F,  Q.,  h, 
d)  are  defined  as 


F  = 


0  0 . 0  0 

1  0 . 0  0 

or-.  :  ; 


0  0 

0  0 . 1  0  0 

0  0 . 0  1  0 


*M-1  - 
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fl=l1  = 


1 

2m-i 


d*  =  6*0 

W 

with  ^  denoting  a  1xM  vector  defined  by  M  of  the  MA  recursion 
coefficients, 

=  b2  ...  bi^] 


The  special  form  of  matrix  F  is  one  of  the  possible  four 
variations  of  the  so-called  companion  matrix  form.  Also,  the 
system  parameters,  the  quadruple  (F,  fl,  h,  d)  ,  is  a  variation  of 
the  so-called  controllable  canonical  form.  These  forms  have  the 
minimal  number  of  non-zero  elements  (whereby  the  name  "canonical") 
of  all  possible  SVMs  that  model  the  scalar  MA  process. 

Note  that  the  definition  of  the  state  vector  y,{n)  in  terms  of 
the  sequence  {u(n)}  inherently  defines  the  initial  condition  vector, 
y.(0) .  Once  the  initial  condition  vector  is  defined,  the  state 
propagation.  Equation  (A-2a) ,  provides  for  continued  generation  of 
the  output  process. 

Verification  of  the  above-defined  model  proceeds  as  follows. 
The  form  of  matrix  F  provides  for  continued  "scrolling"  of  the 
input  noise  sequence  as  elements  of  ^(n),  for  all  n.  Validation  of 
the  model  follows  from  {A-2b)  and  the  definition  of  il,  w{n),  and 
y(n).  That  is. 
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x{n)  =  +  d*w(n)  =  +  bjuin) 

Expanding  the  term  b'V(n)  ,  and  substitution  of  the  definition  of  ^(n) 
in  terms  of  the  sequence  {u(n)}  results  in 

x{n)  =  bou(n)  +  b*u(n-1)  +  b2u(n-2)  +  . . .  +  bMU(n-M) 

which  is  the  MA  process  definition.  Model  validation  can  be 
carried  out  also  using  the  transfer  function  concept,  as 
summarized  next . 

Consider  first  the  derivation  of  the  transfer  function  from 
the  MA  process  definition.  Since  the  MA  process  is  a  discrete¬ 
time  process,  the  appropriate  tool  for  the  determination  of  the 
transfer  function  is  the  2-transform.  Application  of  the  2- 
transform  to  the  definition  of  the  MA  model  results  in  the 
expression 


X(2)  =  X  bKZ-'^U(z) 


k«0 


where  2  denotes  the  transform  variable,  and  X(2)  and  U(z)  are  the  2- 
transforms  of  the  sequences  {x(n)}  and  {u(n)},  respectively.  The 
transfer  function  for  this  linear  system  is  then  defined  as 


M 


k>«0 


This  corresponds  to  the  transfer  function  of  an  all-zero  system, 
as  is  well  known. 


The  transfer  function  for  a  single-input,  single-output  state 
variable  model  (A-2)  is  of  the  form 
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T(2)  =  +  d 


The  particular  characteristics  of  matrix  F  and  vector  fl,  lead  to  a 
very  simple  expression  for  the  product  Izi .  F]-’a  ;  namely, 

[zl  -  ^  Oil) 


where  y{z)  is  the  system  characteristic  polynomial  (the  determinant 
of  matrix  [zi  -  F]) . 

Y(z)  =  z^ 

and  Q(z)  is  vector  with  elements  of  the  form  ei(z)  =  that  is, 

a^(z)  =  [zM-l  ...  z2  z  1  ] 

U  .It 

Substitution  of  these  expressions  and  of  n  and  d  in  the  equation 
for  the  transfer  function  leads  to  the  following  result 

Y(z) 

T(z)  =  b*o  +  b;  2  U  b*2Z-2  +  . . .  +  b^z'^^  =  ^  b^z'^^ 

k-O 


This  result  is  identical  to  the  transfer  function  expression 
derived  from  the  definition  of  the  MA  process. 

A .  2  Scalar  _AR  Procesa  Model 


A  scalar  AR  process  of  order  M  is  defined  as 
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M 

x(n)  =  -  X  aj;x{n-k)  +  u(n) 

k*1 

x(n)  =  -  a*^x(n*1 )  -  c4x(n-2)  -  ...  -  aJ|^(n-M)  +  u{n) 

where  {u(n)}  is  a  zero-mean  white  noise  sequence.  This  recursion 
can  be  modeled  with  a  state-space  system  of  the  form  (A-2)  with 
input  sequence  {u(n)},  and  state  vector  with  elements  that  are 
determined  by  the  output  sequence, 


'  yi(n)  ■ 

x(n-1) 

jt(n)  = 

yM-5(r!) 

= 

x(n-M+1) 

L  yM(n)  J 

x(n-M) 

The  output  noise  sequence  is  equal  to  the  input  noise  sequence, 

w(n).=  u(n)  Vn 

This  implies  complete  correlation  between  the  input  and  output 
noise  sequences  in  the  SVM  (as  in  the  case  of  the  MA  model)  . 
Model  parameters  (F,  g,  h,  d)  are  defined  as 
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u 

with  a  denoting  a  vector  with  elements  equal  to  the  AR  recursion 
coefficients, 

a^  =  [ai  4  •••  4|] 

The  system  parameters  quadruple  (F,  fl.,  il,  d)  ,  is  in  controllable 
canonical  form,  as  in  the  MA  model  case. 
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Note  that  the  definition  of  the  state  vector  ijl(n)  in  terms  of 
the  sequence  {x(n)}  inherently  defines  the  initial  condition  vector, 
^(0) .  Once  the  initial  condition  vector  is  defined,  the  state 
propagation,  Equation  (A-2a) ,  provides  for  continued  generation  of 
the  output  process. 

Verification  of  the  above-defined  model  proceeds  as  follows. 
From  (A-2a)  and  the  definition  of  F.  y(n),  y(n+1),  and  g.,  it  follows 

that 

y^^(n+1)  =  -  a^Viln)  -  a*2y2(n)  -  ...  -  +  ^(n) 

y^(n+1)  =  -a^y(n)  +  u(n) 

Also,  it  follows  from  (A-2b)  and  the  definition  of  Jl,  w{n),  and  y(n) 
that 

x(n)  =  h^i:{n)  +  w(n)  =  - 

which  indicates  that  x(n)  *  y^(n+1 ) .  Then,  expanding  the  term  - 
and  substitution  of  the  definition  of  y(n)  in  terms  of  the  sequence 
{x(n)}  results  in 

x(n)  =  -  aiX(n-1)  -  a2x(n-2)  -  ...  -  aJ^x(n-M)  +  u(n) 

which  is  the  AR  process  definition. 

The  transfer  function  approach  can  be  used  also  to  validate 
this  SVM  for  scalar  AR  processes.  Application  of  the  Z-transform 
to  the  definition  of  the  AR  model  results  in  the  e.xpression 


a;2'<X(z)  =  U(z) 


where  Bq  =  1  is  introduced  for  notational  simplicity,  and  X(z)  and 
U(2)  are  the  Z-transforms  of  the  sequences  {x(n)}  and  {Li(n)}, 
respectively.  The  transfer  function  for  this  linear  system  is 
then  defined  as 


T(z)  = 


X(z) 

U(z) 


1 


This  corresponds  to  the  transfer  function  of  an  all-pole  system, 
as  is  well  known. 

Consider  now  the  transfer  function  for  the  state  variable 
model  (A-2).  In  the  present  AR  process  case,  the  system 
characteristic  polynomial  is 

Y(z) »  z^  -t-  a^z^-i  + . . .  +  a*^,.,z  +  a^ 

and  the  particular  characteristics  of  matrix  F  and  vector  Q,  lead 
the  same  simple  expression  for  the  product  [zl  -  F]-^s  ;  namely, 

where  is  as  defined  previously.  Notice  that  the 

characteristic  polynomial  can  be  expressed  as 

7(z)  =  z^^  +  a^Q(z) 

Substitution  of  these  expressions  and  of  il  and  d  in  the  equation 
for  the  transfer  function  leads  to  the  following  result 
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T{z)  = 


h\z) 6  y{z) 
Y(2) 


-  a^e(2)  Y(2)  2^ 

7(2)  ~  7(2) 


T(2)  » 


2  ^7(2)  H 


I 


k.O 


This  is  identical  to  the  transfer  function  expression  derived  from 
the  definition  of  the  AR  process. 

A .  3  acalax _ ARMA  Proeeaa  Model 

A  scalar  ARMA  process  of  order  M  is  defined  as 

M  ^  M  ^ 

x(n)  =  -  X  +  X  ^k^(^**<) 

k-1  k>0 


x(n)  =  -  a*,x(n-1)  -  ...  -  aJ^,x(n-M+1)  -  aJ^x(n-M)  +  bou(n)  +  b*u(n-1)  + 

+  b2u(n-2)  +  . . .  +  bMU{n-M) 

where  {u(n)}  is  a  zero-mean  white  noise  sequence.  This  recursion 
can  be  modeled  with  a  state-space  system  of  the  form  (2)  with 
input  sequence  {u(n)},  and  output  noise  sequence  equal  to  the  input 
sequence, 


w(n)  =  u(n)  V  n 

This  implies  complete  correlation  between  the  input  and  output 
noise  sequences  in  the  SVM  (as  in  the  case  of  the  MA  and  the  AR 
models)  .  Model  parameters  (F,  g,  tl/  d)  are  defined  as 
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d*  =  b’o 

Here,  as  in  the  AR  case,  vector  a  has  elements  equal  to  the  AR 
recursion  coefficients, 

a^  =  [a;  a*  ...  a;^] 

ij 

and  vector  Jl  has  elements  defined  by  M  of  the  MA  recursion 
coefficients, 
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b”=[bi  b*2  ..  bi,] 

The  system  parameters  quadruple  (F,  g.,  il,  d) ,  is  in  controllable 
canonical  form,  as  in  the  MA  and  AR  model  cases. 

State  vector  initial  conditions,  J£.(0),  for  this  case  are 
related  to  the  input  and  output  sequences  in  a  more  complex 
manner,  and  have  to  be  selected  appropriately.  Once  the  initial 
condition  vector  is  defined,  the  state  propagation.  Equation  (A- 
2a),  provides  for  continued  generation  of  the  output  process. 

The  simplest  approach  to  validate  this  model  is  via  the 
transfer  function  approach.  Application  of  the  Z-transform  co  the 
definition  of  the  ARMA  model  results  in  the  expression 

i  a;2-^X{z)  =  i  b;2*U(z) 

k«0  k*0 


where,  as  before,  X(z)  and  U{z)  are  the  Z-transforms  of  the 
sequences  {x(n)}  and  {u(n)),  respectively,  and  Rq ~ ^  introduced  for 

notational  simplicity.  The  transfer  function  for  this  linear 
system  is  then  defined  as 


T(z)  = 


M  M 

X(z)  _  k-0  _  k-0 _ 

U(z)  "  M  M 

X  <2’''  X 

k«0  k«0 


where  the  two  polynomial  ratio  expressions  (corresponding  to 
inverse  powers  of  Z  or  direct  powers  of  Z)  are  equivalent,  as 
indicated.  This  is  a  transfer  function  with  both  poles  and  zeros, 
as  expected  for  an  ARMA  process. 
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Consider  now  the  transfer  function  for  the  state  variable 
model  (A-2)  .  For  an  ARMA  process  the  system  characteristic 
polynomial  is 

7(z)  =  +  a^z^-i  + . . .  + 


which  is  equal  to  that  for  an  AR  process  SVM  model.  As  in  the 
other  two  cases, 

[zi  -  Fl-’a  =  fi(z) 

7(2) 

given  the  particular  features  of  matrix  F  and  vector  fl,  (fi(Z)  is  as 
defined  previously) .  Notice  also  that,  as  in  the  AR  process  case, 
the  characteristic  polynomial  can  be  expressed  as 

7(z)  *  z^  +  a^Q(z) 


Substitution  of  these  expressions  and  of  and  d  in  the  equation 
for  the  transfer  function  leads  to  the  following  result 


T(z)  = 


h^Q(z)  ^•d*7(2)  ^  (b”-b*oa*^)Q(z)  -t-boTlz) 
7(2)  7(2) 


bpzM  b^a(2) 
7(2) 


It  is  easy  to  verify  that  this  result  is  identical  to  the  transfer 
function  expression  derived  from  the  definition  of  the  ARMA 
process.  That  is. 


M 


TM  -  is*' 

^  ■  7(2) 


M 


I  a:zM-k 


k>0 
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where  Eq =  1 ,  as  before. 


A.  4  Medela  fQr__VeetQr  Raeuraive  Proeasaea 

Vector  recursive  processes  of  the  MA,  AR,  and  ARMA  type  can 
be  represented  with  SVMs  of  the  type  given  herein.  For  vector 
recursive  p’’  cesses  the  appropriate  notation  is: 

MA  2<(n)  =  BQU(n)  +  Byu(n-1 )  +  B2U(n-2)  +  . . .  +  Bjjii(n-M) 

AR  x(n)  =  -  A!j^2i(n-1 )  -  A2X(n-2)  -  ...  -  A|j]^(n-M)  +  ii(n) 

ARMA  2S(n)  =  -  A*j*i(n-1 )  -  ...  -  AU.i2i(n-M+1 )  -  A]Jx(n-M)  +  BoU(n)  +  BVil(n-1 )  + 

+  B2ii(n-2)  +  . . .  +  B|Jii(n'M) 

where  each  of  the  coefficient  matrices  is  dimensioned  JxJ ,  Also 
analogous  to  the  scalar  case,  the  corresponding  transfer  function 
matrices  can  be  defined  using  the  2-transform;  which  leads  to 

Tma(z)  =  B(z) 

T^p(z)  =  A-’(z) 

^arma(^)  =  a  \z)  B(z) 

where  A(z)  and  B(2)  are  the  following  matrix  polynomials  in  Z, 

A(2)  =  X  Al^z-k 

k«0 

B(z)  =  XbJz''' 

k>0 
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with  Aq  the  JxJ  identity  matrix.  The  matrix  pair  {A(z).  B(z)} 
(including  the  cases  with  either  A(z)  =  I  or  B(2)  =  I)  corresponding  to 
a  linear  discrete-time  system  is  referred  to  as  a  matrix 
polynomial  representation  or  a  matrix  fraction  description  (MFD) 
for  the  system. 

Departing  from  the  time-domain  definition  for  the  vector 
recursive  processes,  the  SVM  for  each  of  the  three  processes  is  of 
the  same  form  as  the  corresponding  scalar  case  SVM,  with  the 
following  changes :  a  JxJ  coefficient  matrix  in  place  of  the 
corresponding  coefficient  scalar,  a  JxJ  identity  matrix  (Ij)  in 
place  of  each  unit  scalar,  and  a  JxJ  null  matrix  (Oj)  in  place  of 

each  zero-valued  scalar.  Specifically,  the  SVM  for  the  ARMA 
vector  process  is: 
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The  SVM  for  the  other  vector  processes  (MA;  AR)  is  obtained  by 
substituting  the  correct  values  for  the  vector  process 
coefficients  in  the  above  system  parameters  (that  is,  Aj  =  Oj  for  an 
MA  process;  and  Bq  =  Ij  and  Bj  =  Oj,  i>1  for  an  AR  process).  In  all 
cases,  the  transfer  function  matrix  is  obtained  from  the  SVM 
representation  as 

T(2>*  H^[2I  -  F]'^G  +  D” 

A  transfer  function  calculated  according  to  this  relation  is 
equivalent  'to  the  transfer  function  calculated  from  the 
appropriate  polynomial  matrices. 

The  order  (dimension  of  the  state  vector)  of  the  resulting 
SVM  for  each  of  the  three  vector  processes  is  N  =  MJ,  since  for 
each  process  the  system  matrix  F  consists  of  M  block  rows  and  M 
block  columns,  where  each  block  in  each  row  and  column  is  a  JxJ 
matrix.  SVM  order  is  important  for  practical  and  computational 
considerations.  An  SVM  representation  is  of  minimal  order  if  no 
other  SVM  representation  of  lower  order  leads  to  the  same  transfer 
function  matrix.  In  terms  of  the  system  parameters  (F,  G,  H,  D)  , 
the  order  of  the  SVM  representation  is  determined  by  the  rank  of 
the  controllability  matrix  or  the  rank  of  the  observability 
matrix,  whichever  is  smaller.  Given  the  form  of  the  matrix  pair 
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(F,G)  for  all  three  cases,  it  is  easy  to  verify  that  the 
controllability  matrix  has  full  rank  for  all  three  cases. 
However,  the  observability  matrix  has  a  simple  form  only  for  the 
MA  SVM.  The  special  form  of  the  observability  matrix  for  the  MA 
case  considered  herein  (with  a  square  matrix)  indicates  by 

inspection  that  the  rank  of  the  observability  matrix  is  equal  to 
MJ  if  and  only  if  matrix  has  full  rank.  Such  a  simple  result 

is  not  available  for  the  AR  and  the  ARMA  SVMs .  Determination  of 
the  conditions  on  the  coefficients  of  the  polynomial  matrices  A(z) 
and  B(2)  for  AR  and  ARMA  vector  processes  that  lead  to  an  SVM 
representation  of  minimal  order  is  a  difficult  problem.  This  is 
due  to  the  fact  that  both  .\R  and  the  ARMA  vector  processes  lead  to 
a  transfer  function  matrix  with  elements  which  are,  in  general,  a 
ratio  of  polynomials  in  2. 

Model  order  and  related  issues  for  matrix  polynomial 
representations  have  been  discussed  by  several  researchers.  The 
results  summarized  next  are  available  in  the  text  by  Rosenbrock 
(1970).  Consider  the  matrix  polynomial  representation  of  a 
system,  and  assume  that  the  determinant  of  A(2)  is  different  from 
zero  to  eliminate  pathological  cases.  For  an  AR  vector  process, 
the  order  of  the  system  is  given  by  the  degree  of  the  determinant 
of  A(z).  Thus,  the  SVM  representation  presented  herein  for  vector 
AR  processes  is  of  minimal  order  if  the  determinant  of  A(2)  (with 
Ao  =  Jj)  has  degree  equal  to  MJ . 

Several  definitions  need  to  be  introduced  prior  to  stating 
the  relevant  results  regarding  minimal  order  for  ARMA  vector 
processes.  A  square  polynomial  matrix  is  said  to  be  regular  when 
the  matrix  coefficient  of  the  highest  power  of  2  is  non-singular. 
The  determinant  of  a  regular  polynomial  matrix  has  maximum 
possible  degree.  A  square  polynomial  matrix  is  said  to  be 
\inimQdular  if  its  determinant  is  a  non-zero  constant.  Unimodular 
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polynomial  matrices  have  an  inverse  which  is  also  a  polynomial 
matrix.  As  an  example,  the  polynomial  matrix 


Q(z)  =  Qo  +  Q,2-' 


1  +  z"’  3  +  z'^ 

2  +  z*^  4  +  . 


is  unimodular  because  the  determinant  of  Q(z)  is  equal  to  -2. 
Notice  that  the  inverse  of  Q{z)  is  aj.so  a  polynomial  matrix, 


4  +  z‘^ 

.-(a  +  z'M 


-O  +  z-M' 

1  +z’ 


as  expected.  Notice  also  that  Q(z)  is  not  a  regular  matrix  since 
Qi  is  singular. 

Two  polynomial  matrices  A(z)  and  B(z)  are  said  to  have  a  common 
left  divisor  S(z)  if 

A(z) »  S(z)Pa(2) 

B(z)  =  S(z)Pb(2) 

where  S(2),  Pa(2),  and  Pb(2)  are  polynomial  matrices.  Finally,  if 
all  the  common  (left)  divisors  of  two  polynomial  matrices  A(2)  and 
B(2)  are  unimodular,  then  the  two  matrices  are  said  to  be 

_ (Ig.fJ:) _ BCimg  •  That  is,  if  A(z)  and  B(2)  are  relatively 

(left)  prime,  then  the  determinant  of  the  polynomial  matrix  S(z)  in 
the  above  factorizations  is  a  constant.  This  implies  that  the 
degree  of  the  determinant  of  Pa(z)  is  equal  to  the  degree  of  the 
determinant  of  A(2),  and  the  degree  of  the  determinant  of  Pb(2)  is 
equal  to  the  degree  of  the  determinant  of  B(2).  Furthermore,  the 
determinant  of  Pa(z)  has  no  polynomial  factors  in  common  with  the 
determinant  of  Pb(z)  •  A  matrix  polynomial  pair  (A(2).  B(z))  with  A(2) 
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and  B(2)  relatively  (left)  prime  is  an  irreducible  matrix 
polynomial  representation  for  the  system. 

The  relevant  results  for  ARMA  vector  processes  can  be  stated 
now.  As  in  the  AR  case,  for  an  ARMA  vector  process  the 
determinant  of  A(z)  (with  Aq  =  Ij)  must  have  degree  equal  to  MJ  for 

the  SVM  representation  presented  herein  to  be  of  minimal  order. 
However,  two  additional  conditions  must  be  satisfied.  Namely, 
matrix  Bm  must  have  full  rank,  and  the  polynomial  matrices  A(z)  and 
B(2)  must  be  relatively  (left)  prime.  Full  rank  for  matrix  B|^ 
implies  that  B(z)  is  a  regular  polynomial  matrix.  If  A(z)  and  B(z) 
are  not  relatively  prime,  then  the  order  of  the  system  is  reduced 
by  the  degree  of  the  determinant  of  the  greatest  common  (left) 
divisor  of  A(z)  and  B(z).  This  is  related  to  the  so-called 
pole/zero  cancelations. 
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APPENDIX  B.  DETERMINISTIC  REALIZATION  ALGORITHMS 

Deterministic  realization  algorithms  are  of  relevance  in  this 
work  because  they  provide  insight  into  similar  algebraic  issues 
associated  with  stochastic  realization  problems  due  to  the 
similarities  in  the  factorization  of  the  deterministic  and 
stochastic  Hankel  matrices.  Also,  deterministic  realization 
algorithms  can  be  applied  to  obtain  the  matrix  triple  (F,  T,  H)  of 

the  innovations  representation.  However,  stochastic  realization 
algorithms  (such  as  the  canonical  correlations  algorithm  of 
Section  3.0)  are  preferred  because  the  state  correlation  matrix, 
n,  is  identified  also. 

Two  specific  realization  algorithms  are  presented  below:  Ho ' s 
algorithm  and  an  algorithm  using  the  singular  value  decomposition 
(SVD)  .  Both  algorithms  are  based  on  algebraic  and  factorization 
properties  of  the  deterministic  Hankel  matrix  for  a  discrete-time, 
time-invariant,  linear  system,  as  summarized  next. 

B  •  1  Ofitermiaiatig _ Hankel _ Matrix  Properties 

Consider  a  discrete-time,  time-invariant.  Nth-order  system  of 
the  form  (2-2)  where  the  state,  input,  and  output  vectors  are 


deterministic  and  D^  =  [0], 

(B-la) 

^(n+1)  =  F/{n)  +  Gu(n) 

n  >  n, 

(B-lb) 

X{n)  =  H\(n) 

n  >  n, 

(B-lc) 

o 

II 

o 

c 

As  in  the  rest  of  this  report,  the  input  and  output  vectors  are  J- 
dimensional  (in  the  general  case  the  dimension  of  the  input  vector 
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can  be  different  from  the  dimension  of  the  output  vector) .  System 
(B-1)  is  assumed  to  be  completely  reachable  and  completely 
observable,  and  thus  has  minimal-order.  Complete  reachability  and 
observability  also  imply  that  the  NxJL  controllability  matrix 

(B-2)  CL  =  [G  FG  •••  F^'^G  ]  L>N 
and  the  JLxN  observability  matrix 


both  have  rank  equal  to  the  system  order,  N. 

The  JLxJL  deterministic  block  Hankel  matrix  for  system  (B-1) 
consists  of  JxJ  block  elements,  with  impulse  response  matrices 
{A(n)}  assigned  as  the  JxJ  block  elements  according  to  the  rule 

(B-4)  HL  ^block  i,  block  j)  =  A(i+j-1 )  i,j  =  1,2 . L 

for  L  >  N .  In  expanded  form,  matrix  H|_l  is 

A(1)  A(2)  •••  A^L) 

A(2)  A(3)  ...  A(L+1) 

(B-5)  Hl_l  =  . 

A(L)  A(L+1)  ...  A(2L-1)  . 


Block  Hankel  matrices  are  block  symmetric,  but  not  element-by¬ 
element  symmetric,  in  general.  The  impulse  response  sequence  {A(n)} 


for  system  (B-1)  is  given  in  terms  of  the  system  parameter 
matrices  as 

(B-6)  A(n)  =  HV’’G  n>1 

Using  the  Cayley-Hamilton  theorem  it  is  easy  to  show  that  the 
impulse  response  sequence  satisfies  a  set  of  recursion  relations 
of  the  form  (Kalman  et  al.,  1969), 

(B-7 )  A(N+k)  =  -  a* A(N+k-1 .,A(k+1 )  -  aJjA(k)  k  >  1 

where  {aj  }  are  the  coefficients  of  the  Nth-order  characteristic 
equation  of  matrix  F.  Some  systems  have  a  minimal  polynomial  of 
degree  r<N.  For  those  systems  an  rth-order  set  of  recursion 
relations  of  the  form  (B-7)  are  valid.  However,  for  those  systems 
the  set  of  recursion  relations  (B-7)  based  on  the  characteristic 
polynomial  are  valid  also. 

Inspection  of  Equations  (B-2)-(B-6)  indicates  that  matrix 
admits  a  factorization  of  the  form 

(B-8)  ^L,L  ~ 

Given  this  factorization  and  the  fact  that  for  L>N  matrices 
and  Ol  both  have  full  rank  equal  to  N,  it  follows  from  Sylvester's 

inequality  for  the  rank  of  the  product  of  two  rectangular  matrices 
(Gantmacher,  1960)  that  the  rank  of  matrix  Hll  is  equal  to  N. 

In  fact.  Equation  (B-8)  also  implies  that 


(B-9) 


rank(HN^^_N+^^)  =  N 


k>  1 


for  a  system  (B-1)  of  order  N.  Equation  (B-7)  and  the  block 
Hankel  structure  (the  sequential  arrangement  of  the  matrices  {A(n)} 
as  block  elements  of  imply  that  the  block  columns  (rows)  of 

also  satisfy  a  recursion  of  the  form  (B-7) . 

A  column-shifted  deterministic  block  Hankel  matrix,  denoted 
as  Hn_,  is  defined  by  deleting  the  first  block  column  of  H|_l  and 

inserting  a  new  block  column  in  a  manner  such  as  to  preserve  the 
Hankel  structure  (the  same  result  is  obtained  by  deleting  the 
first  block  row  and  adding  a  new  block  row) ;  that  is. 


A(2)  A(3)  •••  A(U1) 


(B-IO)  H,  .  = 


A(3)  A(4) 


A(U2) 


A(L+1 )  A(L+2) 


A{2L) 


The  significance  of  the  shifted  block  Hankel  matrix  is  the  form  of 
its  factorization,  as  indicated  in  Equation  (B-10)  .  The 
factorization's  in  Equations  (B-8)  and  (B-10)  are  the  basis  for  the 
realization  algorithms  presented  herein,  as  well  as  others. 


Consider  the  block  Hankel  matrix  l  of  Equation  (B-5)  . 

Apply  a  sequence  of  elementary  right  and  left  matrix  operations 
(transformations)  to  the  Hankel  matrix  Hll  in  order  to  drive  it  to 

diagonal  form,  with  unity  elements  along  the  diagonal.  That  is. 


(B-11)  = 


'N.JL-N 


OjL-N,N  OjL-MJL-N. 


Here  and  T2  are  non-singular  matrices  which  represent  the 

product  of  all  the  column  operations  and  row  operations, 
respectively,  required  to  transform  Hj_  ^  into  diagonal  form  (B-11)  . 

It  is  always  possible  to  carry  out  such  elementary  transformations 
because  matrix  Hj_j_  has  r--.K  N.  It  follows  from  Equations  (B-8) 

and  (B-11)  that 


IB-12)  =  (TjOi.)  = 


'N 


L  K)L-HN 


[  ^N.JL-N  ] 


The  explicit  factorization  of  the  diagonal  matrix  in  Equation  (B- 
12)  indicates  that  T2  transforms  the  observability  matrix  into  a 

matrix  with  unity  elements  along  the  main  diagonal  and  zeros 
elsewhere.  Likewise,  transforms  the  controllability  matrix  into 

a  matrix  with  unity  elements  along  the  main  diagonal  and  zeros 
elsewhere . 


Given  the  factorizations  in  Equation  (B-12),  matrix  G  is 
obtained  as  the  NxJ  upper-left-hand  submatrix  of  T2Hl|_, 


(B— 13a)  "^2^L  L  ”  (Ts ^l)  ~ 


'N 


<JL-N.N  J 


[G  FG  F^’’G  ’ 


(B-13b)  T2HLL  = 


G  FG  •••  F^'^G 


Similarly,  matrix  H  is  obtained  as  the  JxN  upper-left -hand 
submatrix  of  Hn_T^  , 


1  14 


(B-14a)  Hl.J?  =  q.(cj^)  = 


H»F 


Hr-L-I 


H"F 


[  In  ®n.jl-n  ] 


(B-14b)  = 


H^F 


hHfL-1 


■^JL.JL-N 


Finally,  it  follows  from  Equations  (B-iO)  and  (B-12)  that  matrix  F 


is  obtained  as  the  NxN  upper-left-hand  submatrix  of  T2Hl_lT^  , 


(B-15a)  TgH^LTl^  =  (TaOjlFHcjn  = 


'N 


<JL.N,N 


[F]  [  In  ^n,jl-n  ] 


(B-15b)  TgHLLTj^  = 


^  '“'N.JL-N 

L^JL-HN  ^JL-HJL-N 


This  completes  Ho '  s  algorithm  for  the  determination  of  a  matrix 
triple  (F,  G,  H)  which  realizes  an  impulse  response  matrix 
sequence  {A(n)} . 

In  the  above  discussion  it  is  assumed  implicitly  that  the 
given  impulse  response  matrix  sequence  corresponds  to  an  Nth-order 
system  of  the  form  (B-1)  ,  and  that  the  sequence  is  available 
without  distortions  due  to  noise  or  other  such  effects.  If  either 
of  these  two  conditions  is  not  satisfied,  then  the  diagonalized 
block  Hankel  matrix  (Equation  (B-11))  will  have  non-zero  elements 
beyond  the  Nth  diagonal  location.  In  such  cases,  model  order  is 
estimated  by  determination  of  the  diagonal  location  beyond  which 
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the  diagonal  elements  represent  zero  or  represent  the  contribution 
of  noise.  This  requires  appropriate  selection  of  the  block 
dimension  of  the  Hankel  matrix  (L  must  be  sufficiently  large) . 
One  mechanism  for  verifying  the  cut-off  diagonal  element  is  to 
examine  the  norm  of  the  transformation  matrices.  But  this  implies 
a  large  computational  load.  Other  alternative  model  order 
selection  criteria  have  difficulties  also.  These  difficulties 
arise  because  the  Hankel  matrix  is  transformed  to  an  identity,  and 
because  no  constraints  are  imposed  on  the  norm  of  the  columns  of 
the  transformation  matrices. 

B  .  3  SVD-Baaed _ Realization  Algorithm 

An  alternative  implementation  of  Ho '  s  algorithm  has  been 
proposed  by  Zeiger  and  McEwen  (1974) .  Instead  of  carrying  out 
elementary  row  and  column  operations  on  the  Hankel  matrix 
(Equation  (B-11)),  Zeiger  and  McEwen  (1974)  propose  a  singular 
value  decomposition  of  the  Hankel  matrix.  This  offers  two 
important  advantages:  first,  the  SVD  is  numerically  robust  even 
for  matrices,  of  large  dimensions;  second,  the  SVD  provides  an 
inherent  mechanism  for  the  determination  of  the  model  order,  or 
determination  of  the  best  model  fit  for  a  selected  model  order 
(herein  best  is  intended  in  the  sense  of  minimizing  the  Frobenius 
norm  of  the  difference  between  the  given  Hankel  matrix  and  the 
Hankel  matrix  that  corresponds  to  the  selected  model  order) . 
Golub  (1969)  provides  a  good  summary  of  the  SVD,  its  properties, 
and  its  applications. 

Consider  the  block  Hankel  matrix  H|_  [_  of  Equation  (B-5)  .  The 
singular  value  decomposition  (SVD)  of  Hn_  is  a  factorization  of 
the  form 

{B-16)  H|_l  = 
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where  and  Tg  are  unitary  matrices,  and  ^JL  is  a  diagonal  matrix 

with  non-negative,  real-valued  diagonal  elements  arranged  in  order 
of  descending  magnitude.  That  is,  matrix  Ajl  is  of  the  form 


(B-17a)  Ajl 


^N.JL-N 

^JL-N.N  ^JL-MJL-N. 


(B-17b) 


8i  0  ...  0  0 

0  82  •••  0  0 

0  0  0 

0  0  •••  0  5^, 


<B-I7c)  5^  >  §2  >  . . .  >  8n-i  ^  >0 


The  factorization  in  Equation  (B-16)  is  a  generalization  of  the 
concept  of  the  eigenvector/eigenvalue  decomposition  of  a  matrix, 
with  the  property  that  it  is  applicable  also  to  non-square 
matrices.  This  decomposition  is  unique  (except  possibly  for  sign 
changes  to  the  columns  of  the  unitary  matrices  and  Tg)  .  The 
diagonal  elements  of  Ajj_  are  referred  to  as  the  singular  values  of 
and  the  rank  of  matrix  l  is  equal  to  the  number  of  non-zero 
singular  values.  The  columns  of  Tg  are  the  left  singular  vectors, 
and  the  columns  of  T^  are  the  right  singular  vectors  of  Hn_.  For 
an  Nth-order  system  in  noise-free  conditions,  there  are  N  non-zero 
singular  values,  and  JL  -  N  zero-valued  singular  values. 


In  the  case  where  the  matrix  to  be  decomposed  is  Hermitian 
(not  just  block  Hermitian),  the  SVD  is  an  eigendecomposit  ion . 
That  is,  for  a  Hermitian  matrix,  ~  "^8  “  columns  of  T 
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(singular  vectors)  are  the  eigenvectors,  and  the  singular  values 
are  the  eigenvalues. 

Another  important  property  of  the  SVD  is  that  it  provides  a 
means  for  determining  a  JLxJL  matrix  M  of  specified  rank  k  (with  k 

<  N)  which  best  approximates  the  Hankel  matrix  in  the  sense  of 

minimizing  the  Frobenius  norm  of  the  difference  between  the  Hankel 
matrix  and  the  desired  matrix  M.  The  desired  optimal  matrix 

approximation  is  of  the  form  (B-16),  with  the  modification  that 
only  the  first  k  diagonal  elements  of  submatrix  (Equation  (B-17) 

are  retained,  and  the  diagonal  elements  beyond  the  kth  one  are  set 
to  zero. 

Equation  (B-16)  can  be  factorized  further  by  taking  the 
matrix  square  root  of  Ajj_  to  obtain  (since  Aj^  is  diagonal,  its 

matrix  square  root  is  trivial) 

(B-18)  =  TbAjJ”  =  (iBAifllAif  T»)  = 

Given  the  explicit  factorization  in  Equation  (B-18),  it  follows 

from  Equation  (B-2)  that  matrix  G  is  given  by  the  NxJ  upper-left- 

1/2  H 

hand  submatrix  of  ^jlTa;  similarly,  from  Equations  (B-3)  and  (B- 

i_j 

18),  it  follows  that  matri.x  H  is  given  by  the  JxN  upper-left-hand 
submatri.x  of  TgAy^. 

Matrix  F  is  obtained  using  Equations  (B-10)  and  (B-16)-(B- 
18)  .  Specifically, 

(B-19)  F  =  A,!,^  *^N,JL-N  ]  ^L.lTa 

.9jl-n,n  . 


1  1  8 


This  completes  the  SVD-based  algorithm  for  the  determination  of  a 
matrix  triple  (F,  G,  H)  which  realizes  an  impulse  response  matrix 
sequence  {A(n)} . 

As  before,  the  above  development  assumed  that  the  available 
impulse  response  matrix  sequence  corresponds  to  an  Nth-order 
system  of  the  form  (B-1)  ,  and  that  the  sequence  is  available 
without  distortions  due  to  noise  or  other  such  effects.  If  either 
of  these  two  conditions  is  not  satisfied,  then  there  will  be  more 
than  N  non-zero  singular  values  for  the  block  Hankel  matrix 
(Equations  (B-16)  and  (B-17)).  In  such  cases,  model  order  is 
estimated  by  determination  of  the  diagonal  location  beyond  which 
the  singular  values  represent  zero  or  represent  the  contribution 
of  noise.  This  requires  appropriate  selection  of  the  block 
dimension  of  the  Hankel  matrix  (L  must  be  sufficiently  large) . 

In  the  SVD-based  algorithm,  the  columns  of  matrices  and  Tg 

have  unity  norm  (such  is  not  the  case  for  Ho's  algorithm).  That 
is,  the  magnitude  of  each  singular  value  is  representative  of  the 
importance  of  the  contribution  (in  the  sense  of  the  Frobenius 

norm)  of  the  singular  value  to  the  numerical  representation  of  the 
operator  Thus,  model  order  determination  using  the  singular 

values  has  a  firm  numerical  and  algebraic  foundation,  and  can  be 
carried  out  once  the  SVD  is  computed. 


APPENDIX  C.  COMPLEX-VALUED  CANONICAL  CORRELATIONS 


Hotelling  (1936)  introduced  the  concept  of  canonical 
variables  and  canonical  correlations  to  establish  a  canonical 
relationship  between  two  sets  of  random  variables  (or  between  two 
random  vectors) .  In  linear  algebra,  the  term  "canonical"  is  used 
to  denote  the  element  of  an  equivalence  class  which  is  represented 
with  the  minimum  number  of  non-zero  independent  parameters .  For 
example,  the  Jordan  form  is  a  canonical  form  for  the  equivalence 
class  of  square  matrices  under  a  similarity  transformation.  As 
defined  by  Hotelling  (1936),  the  canonical  variables  embody  the 
essence  of  the  correlation  structure  among  the  random  variables  of 
the  two  given  sets. 

The  canonical  variables  formulation  is  presented  herein  for 
the  special  case  where  the  dimension  of  the  two  random  vectors 
(the  number  of  variables  in  each  set)  is  the  same  because  that  is 
the  case  in  the  context  of  the  multichannel  detection  application. 
Extension  to  the  general  case  where  the  two  vectors  have  different 
dimensions  is_ straightforward. 

Consider  two  complex-valued,  zero-mean,  L-dimensional  random 
vectors  i  and  V  with  auto-  and  cross-correlation  matrices  defined 
as 

(C-1)  R^z  = 

(C-2)  =  E[^vH] 

(C-3)  R^v  =  E[jv^] 

(C-4)  Rv2  = 
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The  rp^nonical  variables  ror  i  and  v  are  two  complex-valued,  zero- 
mean,  L-dimensional  random  vectors 

(C-5)  U  =  [^i 

(C-6)  fi  =  [3i  32... 

such  that  the  following  conditions  are  satisfied: 

i) 

ii)  ^ 

iii)  II.,  and  3.,  have  unit  variance  and  are  maximally 

correlated,  with  correlation  coefficient  p.,  . 

iv)  for  i  ^  L,  |lj  and  3j  have  unit  variance  and  are  maximally 

correlated,  with  correlation  coefficient  p,; 
furthermore,  p,  is  uncorrelated  with  p,..,,  Pj.gr  .  .  .  ,  p.,, 
and  3j  is  uncorrelated  with  3j..,,  3j.2,  .  .  .  ,  3^ . 

v)  1  >  p.,  >  P2  S  .  .  .  >  Pl  >  0 

The  two  linear  transformations  and  T2  introduced  in  conditions 
(i)  and  (ii)  are  complex-valued,  full-rank,  LxL  matrices. 

Condition  (v)  implies  that  the  positive-valued  correlation 
coefficients  are  selected  (the  sign  of  the  rows  of  matrices  T.,  and 
T2  can  be  selected  in  all  cases  such  that  the  correlation 
coefficient  of  two  random  variables  p,  and  3,  is  positive-valued) . 
The  correlation  coefficients  {Pi)  are  the  canonical  correlations  for 
Z  and  y.  Since  the  canonical  variables  {p,}  and  {3,}  are  covariance- 

normalized,  their  correlation  coefficients  are  less  than  or  equal 
to  unity,  as  indicated  in  condition  (v) . 
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Conditions  (iii)-(v)  can  be  expressed  in  compact  form  using 
Conditions  (i)  and  (ii)  and  Equations  (C-l)-(C-4), 

<c-7)  EIuuH)  =  Ij  =  T,R„t" 

(C-8)  E[SJJH)  =  Ij  =  TjFUTJ* 

(C-91  ECail”!  =  -  TjR„T," 

Pi  0  ...  o' 

0  p2  •  0 

0  0  ••.  Pl, 

(C-lOb)  1  >  p,  >  p2  >  .  .  .  >  Pl  >  0 

Equations  (C-7)-(C-10)  constitute  an  analytic  formulation  of  the 
canonical  correlations  problem.  Golub  (1969)  has  shown  that  the 
solution  for  this  problem  in  the  case  of  real-valued  variables  can 
be  obtained  using  the  singular  value  decomposition  (SVD) .  The 
extension  to  the  case  of  complex-valued  variables  is 
straightforward,  as  carried  out  herein. 

The  first  step  in  the  development  is  to  determine  the  matrix 
square  root  of  each  of  the  correlation  matrices  R22:  and  Ryv  ^ 

matrix  square  root  for  a  correlation  matrix  can  be  calculated 
using  any  one  of  several  methods,  and  all  methods  lead  to 
equivalent  results  in  the  context  of  the  problem  at  hand. 
Alternative  methods  include  the  Cholesky  decomposition  and  the 
SVD.  Of  the  alternative  methods,  the  SVD  method  is  preferred 
herein  because  of  its  robust  numerical  properties,  and  because  the 


(C-lOa)  R;^  = 
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inverse  of  the  square  root  matrix  is  determined  easily  given  the 
SVD  of  the  matrix.  Additionally,  the  SVD  is  used  for  another 
purpose  in  the  realization  algorithm.  Thus,  the  matrix  square 
roots  of  correlations  matrices  R22  and  are  obtained  using  the 

SVD  as 

(C-11)  R„  =  R^^ 

{C-12)  Ryy  =  UySyU^  =  ( U  y  U  ”)  (UySi^^  U  R^y^^ 

Now  transform  the  random  vectors  2  and  ^  to  define  two  correlation- 
normalized  random  vectors  as 

(c-13)  a  = 

(C-14)  y  =  =  UyS-J'^uf 

Given  these  definitions,  it  is  easy  to  show  that 
(C-15)  E[g.g^l  =  Ij 

(C-16)  E[yy^]  =  Ij 

{C-17)  ElYe^^]  =  R^e  =  Rw^Ry,  R'i'^ 

Equations  (C-15 )- (C-17 )  are  similar  to  Equations  (C-7)-(C-9),  but 
it  is  incorrect  to  assume  that  Equations  {C-13)  and  (C-14)  define 
the  desired  canonical  transformations,  and  that  g  and  y  are  the 

desired  canonical  variables.  Variables  g  and  y  are  not  the 
canonical  variables  because  their  correlation  matrix,  R.^,  is  not 
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in  diagonal  form.  However,  the  variables  S  and  7  constitute  an 
important  intermediate  transformation. 

Consider  now  the  cross-correlation  matrix  of  Equation  (C- 
17) ,  and  carry  out  an  SVD  on  it  to  obtain 

(C-18)  R^e  =  UpAp|Vp 

Ur  and  Vr  are  unitary  LxL  matrices,  and  Ar  is  an  LxL  diagonal 

matrix  with  non-negative  elements  along  the  diagonal.  The 
diagonal  elements  of  are  bounded  by  unity  and  zero,  and  are 

arranged  in  order  of  decreasing  magnitude,  with  the  largest  at  the 
(1,1)  location: 


(C-19a)  Ar  = 


Si  0  ...  0  0 

0  82  •••  0  0 

0  0  •••  5l.i  0 

0  0  •  •  •  0  5l 


{C-19b)  1  >  5^  >  82  >  .  .  .  >  5l  >  0 


Given  the  decomposition  in  Equations  (C-18)  and  (C-19)  ,  and  the 
given  the  relations  in  Equations  {C-13 ) - (C-17 ) ,  the  desired 
canonical  variables  and  canonical  correlations  are  obtained  as 


(C-20) 

U  =  vSq  = 

(C-21) 

(C-22a) 

II 

ir 
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(C-22b) 


Pi  =  5i 


i  =  1, 2 . L 


Since  matrices  Uq  and  Vq  are  unitary,  the  norms  of  vectors  and  Ji 
are  equal  to  the  norms  of  vectors  Q.  and  7,  respectively.  From 

relations  (C-20)  and  (C-21)  the  transformation  matrices  are 
determined  in  a  straightforward  manner  to  be 

(C-23)  Ti=vgRi'2 

(C-24)  = 

Direct  substitution  verifies  that  Equations  (C-7)-(C-10)  are 
satisfied  by  this  choice  of  transformation  matrices. 

An  important  relation  can  be  inferred  from  Equations  (C-17) 
and  (C-18 ) , 

(C-25)  Rvz  =  R'^  =  Rl?  Ur  A„vg 

This  relation  is  useful  in  the  validation  of  Equation  (C-9)  using 
the  transformation  matrices  in  Equations  (C-23)  and  (C-24) .  It  is 
useful  also  in  the  system  identification  (stochastic  realization) 
algorithm  of  Section  3.0. 
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